Hydraulix989 6 months ago

I'm confused. The demo doesn't seem to be using CNNs, and the article then mentions just using solvePnP for the 3D case (which is not ML -- it's a geometric solver that fits a pose to overdetermined 2D-3D point correspondences). Wouldn't it be possible to map points on the hand into the scale-invariant reference frame of a prototypical hand in 3D space? We also have newer mobile devices with stereoscopic cameras.

Now I'm also more curious about running CNNs on mobile devices (seems like something that can just be done in a shader).

  • hwoolery 6 months ago

    Apologies if the title is confusing. The demo does use CNNs, as mentioned, to predict the 2D joint locations of the hand. The whole point of this system is that it relies only on a monocular camera, which lets it run on a wide range of devices. CoreML on iOS and TensorFlow Lite on Android handle all the GPU inference for these models.
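
For anyone following along, here is a minimal sketch of why a monocular pipeline needs a fixed-size reference hand before a pose solver can recover depth. It is not the demo's code: the camera intrinsics (`focal`, `cx`, `cy`) and the joint coordinates are made-up values for illustration. Under a pinhole camera model, a hand that is twice as large and twice as far away lands on exactly the same pixels, so the 2D joints alone cannot fix absolute scale.

```python
def project(point, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-space point (x, y, z) to pixel coordinates.

    focal, cx and cy are assumed intrinsics, not values from the demo.
    """
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


def scale(points, s):
    """Scale every point about the camera centre (bigger hand, further away)."""
    return [(s * x, s * y, s * z) for x, y, z in points]


# Hypothetical joint positions in metres, in the camera's coordinate frame.
hand = [(0.00, 0.00, 0.40), (0.03, -0.02, 0.41), (0.05, -0.06, 0.42)]
big_far_hand = scale(hand, 2.0)

pixels_near = [project(p) for p in hand]
pixels_far = [project(p) for p in big_far_hand]

# Both hands produce the same 2D joints, so depth is ambiguous
# until the 3D size of the hand model is fixed.
same_image = all(
    abs(u1 - u2) < 1e-9 and abs(v1 - v2) < 1e-9
    for (u1, v1), (u2, v2) in zip(pixels_near, pixels_far)
)
print(same_image)
```

Fixing the 3D joint positions of a reference hand removes that ambiguity, which is the role the model points play when 2D detections are handed to a PnP-style solver.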