VexU Legal Vision Tracking with Luxonis OAK-1 Camera

Hello everyone,

It’s been a while since I’ve been active on here… and since I’ve competed… But during the pandemic I became interested in computer vision and got the chance to back the Luxonis Kickstarter for the OAK-1 camera. I quickly realized how useful this device could be: it’s small, and all computation is done on board, so there’s no need for a large GPU like a Jetson. Unlike the current VEX Vision Sensor, which is just a reskinned Pixy, these cameras run at high resolutions and framerates. They also let you train your own custom models, which gives them the flexibility to work in different environments. All these features are the basis of a more robust vision system, more comparable to something you would see in FRC.

Last summer I began playing around with getting these cameras to communicate with the V5 Brain, using a Raspberry Pi to talk through the micro USB port on the brain. Some of you might have seen those demos, but here they are:

At the beginning of this year I decided to start a project to actually implement this onto something that would be more applicable to VRC and be a fun experiment for applying custom fabrication, physics, and programming into a presentable project. I decided to make a flywheel turret robot that would shoot the Turning Point plastic 3" balls into a goal around 5’ high from anywhere on the field, taking a lot of inspiration from the FRC game Infinite Recharge.

I designed and built this on and off during the semester, and now that school has ended I’ve gotten a chance to program and test it to a point that I am happy to share with everyone.

Unfortunately the code is not yet in a state where I want to release it publicly, and until that is done I won’t be writing any documentation on how the control system tracks the targets or how the camera communicates with the brain. I WILL eventually make all of that public; I was just too excited to share this to wait until it’s done.

I’m happy to answer any questions anyone has about it in the meantime, and I hope this possibly inspires some VexU teams to pursue using higher level vision targeting in future games :slight_smile:

Huge shout outs to:

  • Tomas Verardi from 375X for some of the CAD elements
  • Jess Zarchi for helping with parts of the design
  • Charlie Grier for helping with parts of the design
  • Andrew Strauss for helping me with Raspberry Pi to V5 Brain communication
  • Jared from 254’s presentation on vision tracking
62 Likes

In addition to the spectacular vision targeting can I just say this robot is incredibly well put together.

11 Likes

Nice! But where did you park your Lambo?

1 Like

Hello @Zach_929U,
Thanks again for sharing this on our Discord! And since you mentioned FRC, the OAK-D has already been used there as well! See the video here and the image below. Since we are trying to focus on robotic vision - are there any suggestions you could provide to improve our platform (depthai), or any pain points you experienced during the development of this awesome robot?
Thanks again, Erik (Luxonis)

5 Likes

Thanks for reaching out Erik,

The process of getting everything working was pretty smooth. If you’re looking for suggestions to widen the user base though, an easier process for turning annotated datasets into trained, deployable models might be useful (I’m sure you are already working on this, or it already exists and I just couldn’t find it).

From what I was able to find online, the easiest ways for someone to go about this would be the Roboflow Colab notebooks put out a couple of years ago, or the built-in Roboflow Train (you only get 3 credits though). I’m sure you can share some better resources you know of, though.

Sadly the Colab notebook by Roboflow is quite outdated and parts of it seem broken, so I followed what was outlined here to write a Python script that may help people who want to train locally on their own computers. It should take away some of the headache of finding matching CUDA toolkit versions for PyTorch and doing the model conversions.

Note: I threw this together in the last day, so excuse how gross some parts of it are… and the lack of documentation. This was more a proof of concept that I could get it to work, but I thought I’d share anyway.
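For anyone who just wants the general shape of that workflow, here’s a minimal sketch of the same idea (my actual script does more; the paths, image size, and epoch count below are placeholders, and it assumes a Roboflow dataset exported in YOLOv5 format plus a local clone of the ultralytics/yolov5 repo):

```python
# Rough sketch of a local YOLOv5 training + ONNX export pipeline.
# Assumes the ultralytics/yolov5 repo is cloned next to this script and a
# Roboflow dataset (YOLOv5 format) lives in dataset/ with its data.yaml.
import subprocess

YOLOV5_DIR = "yolov5"            # clone of https://github.com/ultralytics/yolov5
DATA_YAML = "dataset/data.yaml"  # exported from Roboflow in YOLOv5 format

# Train starting from the pretrained yolov5s checkpoint.
subprocess.run(
    ["python", "train.py", "--img", "416", "--batch", "16", "--epochs", "300",
     "--data", f"../{DATA_YAML}", "--weights", "yolov5s.pt"],
    cwd=YOLOV5_DIR, check=True,
)

# Export the best checkpoint to ONNX so it can later be converted to a .blob
# for the OAK's Myriad X (see the blobconverter snippet later in the thread).
subprocess.run(
    ["python", "export.py", "--weights", "runs/train/exp/weights/best.pt",
     "--include", "onnx"],
    cwd=YOLOV5_DIR, check=True,
)
```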

6 Likes

Feel free to ridicule this; I finally cleaned up the code a bit and organized it here:

I’d really like to make a video explaining the math behind the aiming and the camera latency correction, but I got COVID at the end of summer and now my semester has started, so I probably won’t get around to it until winter.

11 Likes

Hey Zach, this is super super cool work, thanks for sharing!

I have a few questions and comments. I skimmed the code a bit but didn’t deep dive, so apologies for any “read the code” moments.

Accounting for camera latency is a really nice move.

  • How consistent was the camera latency (so, “jitter”, if that’s a familiar term)?
  • How did you measure the camera latency or verify it was accurate?
  • Could you describe the turret’s performance increase from applying latency compensation?
  • Did you consider accounting for other system latencies (computation, actuation, etc.)?
  • What computation (if any) was distributed onto the Pi vs. the V5 Brain?
  • Were you ever able to gauge the communication latency between the Pi and the V5 System to compare it to the pure latency from the camera?

I see some feed forward information is being included in the turret aiming.

  • How is that integrated with the current drive command or trajectory?
  • Does the turret controller have reference velocities or accelerations included (maybe from the expected base trajectory)?
  • If yes to the previous, did you apply that reference trajectory to the flywheel speed control as well?

Overall, super cool stuff. I’ve done latency compensation for autonomous RC cars before, so I know it can be tricky. One technique we learned to measure overall latency was to command a sinusoid to the motors (could be speed or accel command) and then measure the speed or accel from your sensors. If you plot the reference command against the measurement, you get two sinusoids, where one is phase shifted by your overall latency.
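If you want to try it, a toy NumPy version of that phase-shift trick looks something like this (the signals here are simulated stand-ins for whatever command/measurement you actually log on the robot):

```python
# Toy sketch of estimating overall latency from a commanded sinusoid vs. the
# measured response, using cross-correlation to find the phase shift.
# The "logged" signals below are simulated; on a real robot you would log
# the command you send and the speed you measure at a fixed sample rate.
import numpy as np

dt = 0.01             # 100 Hz logging rate (assumed)
t = np.arange(0, 5, dt)
true_latency = 0.03   # 30 ms, unknown in practice
freq = 1.0            # 1 Hz test sinusoid

command = np.sin(2 * np.pi * freq * t)
measured = np.sin(2 * np.pi * freq * (t - true_latency))  # pretend sensor data

# Cross-correlate and take the lag with the highest correlation.
corr = np.correlate(measured, command, mode="full")
lags = np.arange(-len(t) + 1, len(t))
best_lag = lags[np.argmax(corr)]
print(f"estimated latency ~ {best_lag * dt * 1000:.1f} ms")
```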

Another cool thing I’ve been playing with lately is using a Real-time Linux kernel on my Jetson Nano (or RPi if that’s your vibe). I found latency and jitter decreased from 200us on average with max of 10ms all the way down to like 10us, max 60us. Just some fun anecdotes if you’re into that kinda thing.

Code looks nice. If you are shopping for suggestions, the first thing I do when I see new code is beeline to the header files, where I expect to see the comments and explanations for any public interfaces I might use. That way I don’t have to dig through source to get an idea of what’s up.

Again, 10/10 work. Robot is gorgeous. Thanks so much for sharing with everyone.

6 Likes

Thank you for the praise and the programming suggestions. I have done backend Python at my previous jobs, so I am not super familiar with more professional C/C++ formatting, but I’ll make sure to remember that for next time.

How consistent was the camera latency?

The Python library for the Luxonis cameras provides built-in functions to help you measure the camera’s latency (examples on how to use that here). In my early proofs of concept in the summer of 2021 I was using YOLOv3 to train the model and was getting ~50 ms of latency, but after switching to YOLOv5 I noticed that it went down to around 30 ms (both at 1080p, ~30 fps). The 10 microseconds you are getting on your Jetson Nano are extremely impressive, and likely would not require any latency compensation for an application like this.
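If anyone wants to check the latency on their own camera, it’s basically just comparing the frame’s device timestamp against the host clock when the frame arrives. A minimal sketch based on those examples (the stream name and preview size are arbitrary, and a real pipeline would also be running the detection network):

```python
# Minimal sketch of measuring OAK camera latency with the depthai Python API,
# based on the latency examples Luxonis provides.
import depthai as dai

pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(416, 416)
cam.setInterleaved(False)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("preview")
cam.preview.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue("preview", maxSize=4, blocking=False)
    while True:
        frame = q.get()
        # dai.Clock.now() is synchronized with the device timestamps,
        # so the difference is the capture-to-host latency.
        latency_ms = (dai.Clock.now() - frame.getTimestamp()).total_seconds() * 1000
        print(f"latency: {latency_ms:.1f} ms")
```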

Could you describe the turret’s performance increase from applying latency compensation?

Baked into the latency compensation is another ‘safety’ layer that uses the robot’s position as measured by odometry. More important than the latency compensation itself, this helped account for lost frames caused by vibration in the robot (I intentionally isolated the camera on thin polycarb to act as a damper for flywheel vibration, but shaking from overall robot movement still caused issues), as well as motion blur and game objects blocking the camera’s view of the target for small intervals of time.

I can tell you 100% that with the setup I am using on this robot, you will not be able to get anything remotely close without latency/frame-loss compensation. I’ll go into a high-level overview of how that works, since you have other questions about it as well:

The first important thing to understand is how to calculate the distance from the robot to the target from the image. Because the height of the target is known and constant in this situation, you can use the pinhole model of a camera to reduce the problem to simple right-triangle trigonometry using the resolution and FOV of the camera.
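As a rough sketch of that math (all of the constants here are placeholders, not the actual numbers from my robot):

```python
# Sketch of getting distance to the goal from a single camera using the
# pinhole model and the known target height. Constants are placeholders.
import math

IMG_H = 416                    # vertical resolution of the frames (px)
VFOV = math.radians(55)        # vertical field of view of the camera
CAM_HEIGHT = 0.30              # camera lens height off the ground (m)
CAM_PITCH = math.radians(20)   # upward tilt of the camera
TARGET_HEIGHT = 1.52           # height of the goal (m), roughly 5 ft

def distance_to_target(target_y_px: float) -> float:
    """Distance along the floor to the target, given the target's pixel row."""
    # Angle of the target above/below the optical axis (pinhole model).
    norm = (IMG_H / 2 - target_y_px) / (IMG_H / 2)   # +1 at top, -1 at bottom
    pixel_angle = math.atan(norm * math.tan(VFOV / 2))
    # Right-triangle trig: known height difference over the total angle.
    return (TARGET_HEIGHT - CAM_HEIGHT) / math.tan(CAM_PITCH + pixel_angle)
```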

Latency compensation becomes important because, as you can imagine, the robot has moved between the time the image was taken and the time the image finished processing. So the position of the target calculated with the method above does not reflect the current state of the robot in the real world. (picture below)


Using this, what I do is take a new reference frame of the robot’s position every time a frame is captured by the camera. While that image is processing, odometry keeps updating the robot’s position within that reference frame. When the detection finally comes back, you can use triangle math (law of sines / law of cosines) to create a third vector, from the robot’s current position to the target, which is what you should be aiming at right now.
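In code the idea looks roughly like this. It’s a simplified sketch for illustration, not a copy-paste from my repo (the real version lives in the brain code and also handles the dropped-frame fallback described below), and I’ve written it with plain vector addition, which works out to the same triangle:

```python
# Simplified sketch of the latency-compensated aiming idea, in field
# coordinates. Names and structure are made up for illustration.
import math

def target_in_field(pose_at_capture, dist, bearing):
    """Where the camera saw the target, using the pose when the frame was taken."""
    x, y, heading = pose_at_capture
    angle = heading + bearing            # bearing comes from the detection
    return (x + dist * math.cos(angle), y + dist * math.sin(angle))

def aim_solution(pose_at_capture, pose_now, dist, bearing):
    """Vector from the robot's *current* position to the target.

    Adding the camera vector to the stale pose and subtracting the current
    pose is the same triangle you would solve with the law of sines/cosines:
    (old pose -> target), (old pose -> new pose), (new pose -> target).
    """
    tx, ty = target_in_field(pose_at_capture, dist, bearing)
    x, y, heading = pose_now
    dx, dy = tx - x, ty - y
    turret_angle = math.atan2(dy, dx) - heading   # what to aim at right now
    distance_now = math.hypot(dx, dy)             # feeds the hood adjustment
    return turret_angle, distance_now
```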

Hopefully that makes some sense.

On top of this, it is easy to imagine how you can account for lost frames as well. If the object is not detected in the image once it is done being processed, you can use the calculated vector to the goal from a previous frame as a replacement for the actual camera reading. In essence, all this does is make the distance "Delta S" from the image above a longer value.

This creates a robust control system because the odometry is being used as a backbone for the camera, and the camera is used to reduce the distance over which the odometry has to stay accurate in order to know exactly where the target is.

Did you consider accounting for other system latencies (computation, actuation, etc.)?

There is a feedforward term built into the turret aiming that attempts to adjust for the velocity of the drivetrain, but it is very basic. I would really like to make it able to shoot while driving around, which would require a more robust model for handling this, but other problems I’ll mention later are really what prevent that from being possible.

One thing I did not account for was the acceleration of the turret itself. Because of the gyroscopic torque of the flywheel and the weight of the turret, accelerating from a standstill was a headache to deal with. At one point I was using piecewise functions to scale the error at lower RPMs to try to make it accelerate faster, but a well-tuned PID loop ended up working the most consistently across all robot speeds. If I were to rebuild this, though, I would definitely put two motors on the turret just to eliminate the issue altogether.
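For anyone curious, the turret loop ends up looking something like this bare-bones sketch (Python here just for readability, the actual loop runs on the brain, and all of the gains are placeholders):

```python
# Bare-bones sketch of the turret controller: PID on the vision aim error
# plus a feedforward term from the drivetrain's rotational velocity.
class TurretController:
    def __init__(self, kp, ki, kd, kf, dt):
        self.kp, self.ki, self.kd, self.kf, self.dt = kp, ki, kd, kf, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, angle_error, drive_angular_velocity):
        """angle_error: turret aim error (rad); drive_angular_velocity: rad/s."""
        self.integral += angle_error * self.dt
        derivative = (angle_error - self.prev_error) / self.dt
        self.prev_error = angle_error
        # The feedforward counter-rotates the turret as the chassis turns,
        # so the PID only has to clean up the residual error.
        return (self.kp * angle_error
                + self.ki * self.integral
                + self.kd * derivative
                - self.kf * drive_angular_velocity)
```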

What computation (if any) was distributed onto the Pi vs. the V5 Brain?

The Pi handled the image processing exclusively. It would send the coordinates of the detected object to the V5 Brain, and then all of the math/latency correction was done on the brain along with the other robot functions.

Were you ever able to gauge the communication latency between the Pi and the V5 System to compare it to the pure latency from the camera?

I never actually measured it, but it was never an issue, so I assume it is negligible (I am only sending a 6-character string between the Pi and the brain).
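For a rough picture of the Pi side: it’s basically just a short fixed-width string written over the brain’s USB serial port. This is a hypothetical sketch using pyserial; the device path, baud rate, and message format are assumptions for illustration, not my exact protocol:

```python
# Hypothetical sketch of the Pi-side sender: the detection gets packed into a
# short fixed-width string and written over the brain's USB serial link.
import serial

ser = serial.Serial("/dev/ttyACM1", 115200)  # V5 user serial port (path may differ)

def send_detection(x_px: int, y_px: int) -> None:
    # e.g. "208156" -> x=208, y=156; the brain parses the fixed-width fields.
    msg = f"{x_px:03d}{y_px:03d}"
    ser.write(msg.encode())
```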

did you apply that reference trajectory to the flywheel speed control as well?

There is no flywheel speed control; the hood itself is on a curved rack-and-pinion system that adjusts for the change in distance. To answer the question though, that adjustment is baked into all the latency stuff mentioned above. Here is a video of the hood:

Conclusion

The biggest problem I ran into with all of this, which I assumed would be an issue but decided to ignore, was that I was using the drive motor encoders for odometry instead of tracking wheels or any other form of position tracking. This led to errors in the calculations when the wheels slipped during fast accelerations or when pushing against the wall. With tracking wheels I think this would work significantly better, to the point where you could pursue driving and shooting at any speed at the same time, which I would love to see someone do.

Hope this answered all your questions; if you have any more, don’t hesitate to ask. Good luck this season! :wink:

13 Likes

Awesome stuff! That pretty much answers my questions, but now I have a few more… :thinking:

In my early proofs of concept in the summer of 2021 I was using YOLOv3 to train the model and was getting ~50 ms of latency, but after switching to YOLOv5 I noticed that it went down to around 30 ms (both at 1080p, ~30 fps).

I haven’t done any ML for image processing before, so I’m not familiar with YOLO beyond a quick Google search.

  • How difficult would you say it was to train your object detector and how robust was it?
  • Did you have to make your own dataset of a considerable size?
  • What were training times like?
  • Was it robust to different lighting conditions (i.e. would you ever recommend for use in real competition)?

The 10 microseconds you are getting on your Jetson Nano are extremely impressive, and likely would not require any latency compensation for an application like this.

As a side note, I should point out for anyone reading and interested that the numbers I provided for the Jetson represent measurements of the amount of time between when a thread should wake up and when it actually wakes up. Using an RTOS or an RT Linux kernel is typically a good move for robotics on a Jetson or a Pi, but it will not magically fix sensor latencies. The whole process Zach went through is extremely necessary when you want to really control something well. I’m also no RTOS expert, so if anyone needs to correct me, do it gently :laughing:.

One thing I did not account for was the acceleration of the turret itself. Because of the gyroscopic torque of the flywheel and the weight of the turret, accelerating from a standstill was a headache to deal with. At one point I was using piecewise functions to scale the error at lower RPMs to try to make it accelerate faster, but a well-tuned PID loop ended up working the most consistently across all robot speeds. If I were to rebuild this, though, I would definitely put two motors on the turret just to eliminate the issue altogether.

This is really interesting to me. I haven’t taken Dynamics in a while, but is the gyroscopic torque directly opposing the turret’s rotation, or is the gyroscopic torque + the weight causing extra friction/binding in the turret mechanism?

Such clean engineering omg. :heart_eyes:

For you, anything. Just stay tuned… :ghost:

7 Likes

Haha sure thing!

How difficult would you say it was to train your object detector and how robust was it?

With the documentation that is out there for ML image processing, it is not that hard. Because the OAK cameras use Intel’s Myriad X architecture, though, you have to convert your models into the .blob format.

I know that on the OAK forums Luxonis mentioned they were in the process of making a tool to help users go through this, but I don’t think it is out right now, so you can either use the script I posted earlier in this thread or one of the many Google Colab notebooks across the internet.
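If you already have an ONNX export of your model, the conversion itself is pretty short with Luxonis’ blobconverter package. A sketch (the path and shave count are placeholders):

```python
# Sketch of converting an exported ONNX model into a Myriad X .blob using
# Luxonis' blobconverter package. Path and shave count are placeholders.
import blobconverter

blob_path = blobconverter.from_onnx(
    model="best.onnx",   # ONNX export of the trained YOLOv5 model
    data_type="FP16",
    shaves=6,            # number of SHAVE cores to compile for
)
print(f"compiled blob at {blob_path}")
```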

Did you have to make your own dataset of a considerable size?

As mentioned in the earlier post, I used Roboflow to create and annotate the dataset, which I recommend to anyone as a free tool you can use for basically any project because of the number of formats they support for automatic exporting. For this project I used ~200 pictures of the target from different distances, angles, and lighting conditions.
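If you want to pull the exported dataset programmatically, it’s only a few lines with Roboflow’s Python package. Everything below is a placeholder (API key, workspace, project, and version), roughly the kind of snippet Roboflow generates for you:

```python
# Hypothetical sketch of downloading a Roboflow dataset locally in YOLOv5
# format. The API key, workspace, project, and version are placeholders.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("your-project")
dataset = project.version(1).download("yolov5")  # writes images + data.yaml locally
print(dataset.location)
```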

What were training times like?

Not long at all. I would say somewhere between 60-90 minutes for the final (high-epoch) version of the model that I wanted, but you can pump out a 90%+ confidence model within 15-20 minutes for testing if you have a good enough NVIDIA GPU. (I have a 1080 Ti for reference.)

Was it robust to different lighting conditions (i.e. would you ever recommend for use in real competition)

You’re able to get away with smaller datasets that are pretty easy to gather because YOLO applies its own set of distortions (augmentations) to the dataset you give it to create more training data and therefore more robust models. Without trying too hard to account for bad lighting, the models I used were pretty robust and never failed just because I was using them in a different room. The only times they stopped working were in extreme cases of glare on the target and very dark rooms.

I am confident this could work in practice at competitions, as almost every event I’ve been to has had normal lighting. My biggest fear with taking it to a competition depends on what you would be tracking: depending on how you are using it, I could see objects on other fields being detected and messing up your system, which is something to think about when designing/choosing the angle of your camera.

This is really interesting to me. I haven’t taken Dynamics in a while, but is the gyroscopic torque directly opposing the turret’s rotation, or is the gyroscopic torque + the weight causing extra friction/binding in the turret mechanism?

Correct, the gyroscopic torque of the flywheel is opposing the turret’s rotation; intuitively, you can think of rotating the turret like trying to push down on the edge of a spinning top. In a game like Spin Up this would be way less of a problem, though, since the flywheels are sideways.

It does not create binding because it has decent tolerances and it’s all on ball bearings, but it does create resistance to motion for sure. Pics below:



Can’t wait to see what you guys do :smiley:

10 Likes

Sorry to bump this thread again, but here are some renders of the CAD model that I thought I would share!

As always, let me know if you have any questions!


:slight_smile:

18 Likes