
Topics of Machine Learning for New Media

These are my notes on a “reading course” for my PhD studies at Instituto de Matemática Pura e Aplicada (IMPA).

Course name: Topics of Machine Learning for New Media

Advisor: Luiz Velho

Course period: Jan/Feb 2021 (summer) - [ongoing]

Course Plan:

The goal of this course is to study state-of-the-art machine learning techniques, especially deep learning, applied to geometric data in new media applications. We aim to investigate three topics: geometric modeling with machine learning; learning from synthetically generated data; and new platforms that support building data structures, algorithms, and neural network architectures for learning with geometric data.

Within this new deep learning paradigm, we will analyze aspects of geometric modeling, especially implicit modeling and the representation of geometric attributes in a continuous volume of space, as well as differentiable rendering with neural networks. A major challenge in training many machine learning models effectively is the need for labeled data, which humans must annotate, increasing both the cost of data acquisition and the risk of labeling errors. Recent studies point to synthetic data as a way to train models that can still perform well on real data. In this study, we will seek to understand the advantages and limitations of this approach, as well as its connection with virtual simulations.
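To make the idea of implicit modeling concrete, here is a minimal sketch in plain Python. It is not taken from any of the references: it uses a hand-written signed distance function (SDF) for a sphere in place of a learned network, and the names `sphere_sdf`, `occupancy` and `sphere_trace` are hypothetical, chosen for illustration. The surface is the zero level set of a function defined at every point of the volume, and a simple sphere-tracing loop shows how such a function can be queried along a camera ray.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(point, center) - radius


def occupancy(point, sdf=sphere_sdf):
    """Binary occupancy derived from the SDF: 1.0 inside or on the surface, 0.0 outside."""
    return 1.0 if sdf(point) <= 0.0 else 0.0


def sphere_trace(origin, direction, sdf=sphere_sdf, max_steps=64, eps=1e-6, max_dist=100.0):
    """March along a ray (direction assumed unit length) until it reaches the zero level set.

    Returns the distance travelled along the ray, or None if nothing is hit.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t
        # The SDF value is a safe step size: no surface is closer than `dist`.
        t += dist
        if t > max_dist:
            return None
    return None


if __name__ == "__main__":
    print(sphere_sdf((2.0, 0.0, 0.0)))                         # point outside the sphere
    print(occupancy((0.0, 0.0, 0.0)))                          # point inside the sphere
    print(sphere_trace((0.0, 0.0, -3.0), (0.0, 0.0, 1.0)))     # ray aimed at the sphere
```

In the learned setting, a neural network would play the role of `sphere_sdf`, and the rendering loop would need to be differentiable so that image-space errors can update the network weights.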

In addition to studying the literature, we will run experiments with PyTorch3D and Unity, two platforms that incorporate many of the published techniques in geometric modeling and rendering, as well as advances in training and validating deep learning models on geometric data.

References

  1. Ahmed, Eman & Saint, Alexandre & Das, Rig & Shabayek, Abdelrahman & Gusev, Gleb & Cherenkova, Kseniya & Aouada, Djamila. (2019). A survey on Deep Learning Advances on Different 3D Data Representations. 10.13140/RG.2.2.32083.02080.
  2. Ravi, Nikhila & Reizenstein, Jeremy & Novotny, David & Gordon, Taylor & Lo, Wan-Yen & Johnson, Justin & Gkioxari, Georgia. (2020). Accelerating 3D Deep Learning with PyTorch3D. https://arxiv.org/pdf/2007.08501.pdf.
  3. Niemeyer, Michael & Mescheder, Lars & Oechsle, Michael & Geiger, Andreas. (2020). Differentiable Volumetric Rendering: Learning Implicit 3D Representations Without 3D Supervision. 3501-3512. 10.1109/CVPR42600.2020.00356.
  4. Mildenhall, Ben & Srinivasan, Pratul & Tancik, Matthew & Barron, Jonathan & Ramamoorthi, Ravi & Ng, Ren. (2020). NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. https://arxiv.org/abs/2003.08934.
  5. Hinterstoisser, S., Pauly, O., Heibel, H., Marek, M., & Bokeloh, M. (2019). An Annotation Saved is an Annotation Earned: Using Fully Synthetic Training for Object Instance Detection.
  6. You-Cyuan Jhang, Adam Palmar, Bowen Li, Saurav Dhakad, Sanjay Kumar Vishwakarma, Jonathan Hogins, Adam Crespi, Chris Kerr, Sharmila Chockalingam, Cesar Romero, Alex Thaman, & Sujoy Ganguly. (2020). Training a performant object detection ML model on synthetic data using Unity Perception tools.

During the study, we may find that additional references are necessary. You’ll find them at the end of each section, as we cite them.