
Computer Vision, Immersive Technology, and Digital Content

The Lunar Eclipse, October 2014
(15min progression of the Lunar Eclipse, San Francisco/Bay Area, California, October, 2014 - Jeff M. Wang)

Luck is What Happens
When Preparation Meets Opportunity.


- Overview

The COVID-19 pandemic has accelerated the adoption of digital technologies faster than we could have imagined. As businesses change, companies are realizing the power of technology to unify dispersed, global talent. Virtual learning and telehealth are also becoming more advanced and could deliver benefits throughout the world.

- Unlocking the Business Potential of Virtual Worlds

As organizations look to take advantage of virtual worlds, they can explore them inside the studio, in the real world, and in the context of their own businesses. The benefits include:
  • Working alongside the designers and architects of virtual worlds, and learning the new generation of creative tools and simulation technologies that enable Unlimited Realities.
  • Creating hyper-realistic, physically accurate digital twins that simulate natural environments, physical structures, industrial operations, and transportation networks, along with the humans, robots, and AI agents working inside them, to accelerate design and planning cycles across the business.
  • Building shared virtual experiences that convene audiences for collaborative work, recreation, or education through AR/VR or mixed reality.
  • Exploring virtual world economies where transactions in digital currencies and assets will power an explosion of virtual services, experiences, and goods.
  • Enabling virtual world strategies that maximize positive impact on the planet, advancing clients' environmental, social, and corporate governance (ESG) initiatives.


- Digital Content and Technologies

Our world holds countless images and videos from the built-in cameras of our mobile devices alone. But while "images" can mean photos and videos, the term can also cover data from thermal or infrared sensors and other sources. Along with the tremendous amount of visual data available (more than 3 billion images are shared online every day), the computing power required to analyze that data is now more accessible and affordable.

Making sense of this visual data is a trivial problem for a human, even a young child:

  • A person can describe the content of a photograph they have seen once.
  • A person can summarize a video that they have only seen once.
  • A person can recognize a face that they have only seen once before.

We require at least the same capabilities from computers in order to unlock our images and videos.

Sharing engaging and immersive visual content such as photos, videos, 360-degree and real-time augmented experiences is at the heart of staying connected and building community. This depends on developing and refining advanced real-time computational photography and image understanding techniques that allow us to enhance our images and video, track and enhance faces, bodies, and the 3D world, and capture and share the 3D world with high fidelity.

Unlocking the commercial potential of virtual worlds takes research scientists and engineers from many disciplines, including computer vision, computer graphics, computational photography, machine learning, interaction technologies, and mobile development.


[A Magical Night in Istanbul, Turkey - Civil Engineering Discoveries]

- Computer Vision Technology

For many decades, people dreamed of creating machines with the characteristics of human intelligence, those that can think and act like humans. One of the most fascinating ideas was to give computers the ability to “see” and interpret the world around them. The fiction of yesterday has become the fact of today. 

Thanks to advancements in artificial intelligence (AI) and computational power, computer vision technology has taken a huge leap toward integration in our daily lives.

Computer vision is the field of computer science that focuses on creating digital systems that can process, analyze, and make sense of visual data (images or videos) in the same way that humans do. The concept of computer vision is based on teaching computers to process an image at a pixel level and understand it. Technically, machines attempt to retrieve visual information, handle it, and interpret results through special software algorithms.
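To make "processing an image at a pixel level" concrete, here is a minimal sketch in plain Python. It assumes a tiny hand-made RGB image (the pixel values are invented for illustration) and uses hypothetical helper names, `to_grayscale` and `bright_bounding_box`. It converts each pixel to a single brightness value and then locates the bright region. This is only a toy example of pixel-level analysis, not how a production vision system works.

```python
# A tiny 4x4 RGB "image": each pixel is an (r, g, b) tuple in the range 0..255.
# The values are invented for illustration: a bright 2x2 patch on a black background.
BLACK = (0, 0, 0)
image = [
    [BLACK, BLACK,           BLACK,           BLACK],
    [BLACK, (255, 255, 255), (250, 250, 250), BLACK],
    [BLACK, (240, 240, 240), (255, 255, 255), BLACK],
    [BLACK, BLACK,           BLACK,           BLACK],
]

def to_grayscale(img):
    """Convert each RGB pixel to one brightness value (ITU-R BT.601 luma weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in img]

def bright_bounding_box(gray, threshold=128):
    """Return (top, left, bottom, right) of all pixels brighter than `threshold`,
    or None if no pixel qualifies."""
    coords = [
        (y, x)
        for y, row in enumerate(gray)
        for x, value in enumerate(row)
        if value > threshold
    ]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))

gray = to_grayscale(image)
box = bright_bounding_box(gray)
print(box)
```

The program reads raw pixel values and ends with a small piece of "understanding": where the bright object sits in the image.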

Computer vision is one of the most powerful and compelling types of artificial intelligence (AI), and you have almost surely experienced it in any number of ways without even knowing. It is a form of AI in which computers “see” the world, analyze visual data, and then make decisions or gain understanding about the environment and situation. To get the most out of image data, we need computers to “see” an image and understand its content. As a multidisciplinary area of study, computer vision can look messy, with techniques borrowed and reused from a range of disparate engineering and computer science fields.


- Computer Vision, AI, and VR/AR

Augmented reality (AR): A blend of the physical and digital environments, in which data is overlaid on physical reality using a fusion of sensor data from cameras, accelerometers, and other sources.

Virtual reality (VR): A computer-generated simulation of a 3D environment that a person can interact with.

The AR/VR field has traditionally relied on classical (non-AI) computer vision techniques to advance innovation. But many businesses are discovering that these technologies and AI have a deep, complementary connection. AI excels at many tasks that benefit AR/VR: it can track objects, create detailed models of the 3D world, understand what features are in these models, and make judgments about them.

Deep learning models are particularly useful here: they can identify vertical and horizontal planes, track an object’s movements and position, and estimate object depths, among other tasks relevant to AR/VR. In other words, deep learning models can help an AR/VR system interpret complex environments. An auto mechanic could, theoretically, use an AI-powered AR system to view a vehicle’s engine and be told by the system which parts need to be fixed and how.

As a result of these complementary characteristics, AI is starting to replace traditional computer vision methods in AR/VR, and a number of industry leaders project that AI will help drive immersive technology adoption in consumer and business segments. Specifically, AI can enhance AR/VR experiences by providing more realistic models and by giving people a greater ability to interact with scenes.

This powerful partnership of AR/VR and AI is due in part to advances in deep learning that apply to 3D model building, increased availability of data and data storage options like the cloud, and increasing levels of computing power. Regardless of the reasons, the integration is expected to provide exciting opportunities across many industries.


- Image Processing, Computer Vision, and Neural Networks

Computer vision, which today relies heavily on machine learning, focuses on interpreting and understanding images and videos. It is used to teach computers to "see" and to use visual information to perform the visual tasks that humans can perform.

Computer vision models are designed to translate visual data based on features and contextual information identified during training. This enables models to interpret images and videos and apply those interpretations to predictive or decision-making tasks.

While both are related to visual data, image processing is not the same as computer vision. Image processing involves modifying or enhancing images to produce new results. It can include optimizing brightness or contrast, increasing resolution, blurring sensitive information, or cropping. The difference between image processing and computer vision is that the former does not necessarily need to recognize content.
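The image processing operations named above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the 4x4 grayscale image is invented, and the helper names (`clamp`, `adjust_brightness`, `adjust_contrast`, `crop`, `box_blur`) are hypothetical. Real projects would normally use an imaging library instead. Note that none of these functions needs to know what the image shows.

```python
# A tiny grayscale "image": rows of pixel intensities in the range 0..255.
# The values are made up for illustration.
image = [
    [ 10,  20,  30,  40],
    [ 50,  60,  70,  80],
    [ 90, 100, 110, 120],
    [130, 140, 150, 160],
]

def clamp(value):
    """Round to an integer and keep it inside the valid 0..255 range."""
    return max(0, min(255, int(round(value))))

def adjust_brightness(img, offset):
    """Add a constant offset to every pixel."""
    return [[clamp(p + offset) for p in row] for row in img]

def adjust_contrast(img, factor, midpoint=128):
    """Scale each pixel's distance from the midpoint by `factor`."""
    return [[clamp(midpoint + factor * (p - midpoint)) for p in row] for row in img]

def crop(img, top, left, height, width):
    """Return the sub-image of the given size whose corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def box_blur(img):
    """Replace each pixel with the mean of its 3x3 neighbourhood.
    Pixels at the border average only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neighbours = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(clamp(sum(neighbours) / len(neighbours)))
        out.append(row)
    return out

brighter = adjust_brightness(image, 100)
cropped = crop(image, top=1, left=1, height=2, width=2)
blurred = box_blur(image)
```

Each function maps pixels to new pixels by a fixed rule, which is the sense in which image processing "does not necessarily need to recognize content".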

Many modern computer vision algorithms are based on convolutional neural networks (CNNs), which offer significant performance improvements over traditional image processing algorithms on recognition tasks. CNNs are neural networks with a multi-layered architecture in which successive layers gradually reduce the input data to a compact set of the most relevant features. These features are then matched against patterns learned from known (training) data to identify or classify the input. CNNs are commonly used for computer vision tasks, but can also perform text and audio analysis.
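As a rough illustration of how CNN layers reduce an image to a smaller set of relevant values, the sketch below implements three common building blocks in plain Python: a convolution, a ReLU activation, and max pooling. The 6x6 image and the edge-detecting kernel are hand-picked assumptions for the example, and the function names are hypothetical. In a trained CNN the kernel weights are learned from data rather than written by hand, and a final classification stage (omitted here) would act on the pooled features.

```python
def convolve2d(img, kernel):
    """'Valid' 2D cross-correlation (no padding): slide the kernel over the
    image and take a weighted sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(img) - kh + 1
    out_w = len(img[0]) - kw + 1
    return [
        [
            sum(img[y + j][x + i] * kernel[j][i] for j in range(kh) for i in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

def relu(feature_map):
    """Keep positive responses, zero out negative ones."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window.
    Trailing rows or columns that do not fill a window are dropped."""
    return [
        [
            max(feature_map[y + j][x + i] for j in range(size) for i in range(size))
            for x in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for y in range(0, len(feature_map) - size + 1, size)
    ]

# A 6x6 image with a vertical edge: dark on the left (0), bright on the right (9).
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

# A hand-picked vertical-edge kernel (an assumption for this example).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

features = convolve2d(image, kernel)   # 6x6 image -> 4x4 feature map
activated = relu(features)
pooled = max_pool(activated)           # 4x4 -> 2x2
print(pooled)
```

The 36 input pixels shrink to 4 values that respond strongly wherever the vertical edge is present, which is the kind of gradual reduction to relevant features described above.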



 [More to come ...]
