What are the similarities and differences between Magic Leap, which shocked the world, and Microsoft's HoloLens?
1. In late 2014, after raising 500 million in September 2014, Magic Leap held a recruiting info session titled "The World is Your New Desktop". Gary Bradski, senior vice president of perception research at Magic Leap, and Jean-Yves Bouguet, technical director of computer vision, gave the talks. Gary is a leading figure in computer vision: he founded OpenCV (the computer vision library), worked at Willow Garage, and is a consulting professor at Stanford University. Jean-Yves was previously responsible for building the Google Street View cars at Google and is a heavyweight in computer vision. That both of them joined Magic Leap is striking. I attended this info session. Gary introduced Magic Leap's technology on the perception side, briefly explained the principle behind its much-rumored digital light field "Cinematic Reality", and I took photos during the parts where photography was allowed. Most of the substance of this article comes from that talk.
2. At the beginning of this year I took a class taught by Gordon Wetzstein, a Stanford professor of computational photography and digital light field displays: EE 367 Computational Imaging and Display. In the fourth week, the sessions on computational illumination, wearable displays, and display blocks (light field displays) all touched on the principle behind Magic Leap. The materials are available on the course website, EE 367 / CS 448I: Computational Imaging and Display.
A word about Gordon's Stanford computer graphics group. Marc Levoy (the eminent professor who later worked on Google Glass) has long been devoted to light field research. From Marc Levoy's light field camera, to his student Ren Ng founding Lytro, to today's light field displays (glasses-free 3D displays), this group has always been a world leader in light field research. Magic Leap may become the biggest application of light field displays. (For related content, see: Overview of Computational Imaging Research.)
3. This year I attended the Light Field Imaging workshop, a seminar on light field imaging technology. Many light field technologies were on display, and I discussed Magic Leap with many light field display researchers. In particular, I tried a demo of a light field technology close to Magic Leap's: the near-eye light field display by Douglas Lanman of NVIDIA. (For related content, see Near-Eye Light Field Displays.)
4. In the middle of this year I visited Microsoft Research in Redmond, where Richard Szeliski, a principal researcher there, let us try HoloLens. I experienced HoloLens's unmatched localization and perception technology first-hand. Because of a confidentiality agreement, this article gives no details and offers only a comparison in principle with Magic Leap.
Here is the substance:
First, some background. The goal of AR glasses such as Magic Leap and HoloLens is to let you see images of objects that do not exist in the real world and interact with them. Technically, this can be split into two parts:
1. Perception of the real world.
2. A head-mounted display that presents the virtual image.
I will explain the related technologies of Magic Leap in the perception part and the display part respectively.
First, the display section
Let me answer this question briefly first:
Q1. What is the difference between HoloLens and Magic Leap? What is the essential principle behind Magic Leap?
On the perception side, HoloLens and Magic Leap do not differ much in technical direction: both rely on spatial perception and localization, which this article covers later. The biggest difference between Magic Leap and HoloLens is in the display. Magic Leap uses optical fibers to project the entire digital light field directly onto the retina, producing what it calls Cinematic Reality. HoloLens uses semi-transparent glass with DLP projection from the side, so virtual objects are always in sharp focus, similar to the Epson glasses or the Google Glass approach already on the market. It is a 2D display with a narrow field of view of about 40 degrees, which reduces immersion.
The underlying physical principle is that light propagating in free space can be uniquely represented by a 4D light field. Every pixel on the imaging plane receives light from all directions. A position on the imaging plane is two-dimensional and a direction is also two-dimensional, so the light field is four-dimensional. Ordinary imaging is just a 2D integral of the 4D light field (the light arriving from all directions at each pixel is summed into one pixel value), and a traditional display shows this 2D image, so the other two dimensions of information are lost. Magic Leap projects the entire 4D light field onto your retina, so mathematically there is no difference between what you see through Magic Leap and a real object, and no information is lost. In theory, with Magic Leap's device you cannot tell virtual objects from real ones.
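To make the "2D integral of a 4D light field" concrete, here is a minimal Python sketch. The nested-list layout `L[y][x][v][u]`, the synthetic values, and the function names are assumptions chosen for illustration; they say nothing about how Magic Leap actually stores or renders a light field.

```python
# Toy 4D light field: light_field[y][x][v][u] is the radiance arriving at
# pixel (x, y) from direction sample (u, v). Layout and names are
# illustrative assumptions only.

def make_light_field(height, width, n_dirs):
    """Build a small synthetic light field whose values depend on direction."""
    return [[[[float(x + y + u + v) for u in range(n_dirs)]
              for v in range(n_dirs)]
             for x in range(width)]
            for y in range(height)]

def integrate_to_image(light_field):
    """Conventional imaging: sum all direction samples at each pixel.

    The result is a 2D image; the two directional dimensions are lost.
    """
    return [[sum(sum(direction_row) for direction_row in pixel)
             for pixel in line]
            for line in light_field]

def count_samples(light_field):
    """Number of scalar samples stored in the 4D light field."""
    return sum(len(direction_row)
               for line in light_field
               for pixel in line
               for direction_row in pixel)

lf = make_light_field(height=2, width=3, n_dirs=4)
image = integrate_to_image(lf)
samples_4d = count_samples(lf)
samples_2d = sum(len(line) for line in image)
```

Comparing `samples_4d` with `samples_2d` shows how much information the integration discards: each pixel keeps a single number in place of a full grid of directional samples.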
The most obvious difference between Magic Leap's device and others is that the eye can choose where to focus (active selective focusing). For example, when I look at a nearby object, the nearby object is sharp and distant objects are blurred. Note that this requires no eye tracking: the projected light field restores all the information, so the user sees sharpness and blur exactly as with real objects. For example, at about 27 seconds into the virtual solar system video (the GIF below), the camera loses focus and then refocuses. This happens entirely in the camera lens and has nothing to do with Magic Leap's device. In other words, the virtual object is simply there, and where to focus is the observer's own business. This is what is impressive about Magic Leap, and why it calls the effect Cinematic Reality.
Q2. What are the benefits of active selective focusing? Why do traditional virtual displays make you dizzy? How does Magic Leap solve this problem?
As is well known, the eye perceives depth mainly through triangulation between the two eyes and the observed object, which gives the distance from the observer. But triangulation is not the only depth cue. The brain also integrates another important one: the sharpness or blur produced by the eye's focusing (the focus cue). In traditional binocular virtual displays (such as Oculus Rift or HoloLens), objects have no variation in focus. For example, as shown below, when you look at a distant castle, a nearby virtual cat should be blurred, but on a traditional display the cat stays sharp, so your brain gets confused and concludes that the cat is a very large object far away. That contradicts the result of binocular triangulation. The brain's program, shaped by millions of years of evolution, decides the cat is near, then far, then near again; going back and forth overloads the brain, and you feel sick. Magic Leap projects the whole light field, so you can focus actively and selectively. The virtual cat is placed nearby: when you look at it, it is sharp, and when you look at the castle, it is blurred, just as in reality, so you do not get dizzy. In the talk, Gary joked that Jean-Yves, who used to throw up after 10 minutes with an Oculus, can now wear Magic Leap 16 hours a day without feeling dizzy.
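The conflict between the two depth cues can be put into rough numbers. The sketch below is an illustration under stated assumptions, not a model from the talk: it assumes an interpupillary distance of 63 mm and a conventional display whose optics focus at a fixed 2 m. Vergence follows the virtual object's distance, while the eye's focus is stuck at the display's focal distance; the gap is expressed in diopters (1 / meters).

```python
import math

def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two lines of sight when both eyes fixate a point.

    ipd_m (interpupillary distance) is an assumed typical value.
    """
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))

def accommodation_diopters(distance_m):
    """Focus demand of the eye for an object at the given distance."""
    return 1.0 / distance_m

def focus_conflict_diopters(virtual_distance_m, display_focal_distance_m):
    """Mismatch between where the eyes converge and where they must focus."""
    return abs(accommodation_diopters(virtual_distance_m)
               - accommodation_diopters(display_focal_distance_m))

# Hypothetical scene: a virtual cat at 0.5 m on a display focused at 2 m.
cat_conflict = focus_conflict_diopters(0.5, 2.0)
# A virtual object placed exactly at the display's focal distance.
no_conflict = focus_conflict_diopters(2.0, 2.0)
```

On a display that reproduces the light field, the focus demand would follow the virtual distance as it does for a real object, so the mismatch term would vanish at every depth and not only at one fixed distance.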
Addendum: some people ask why articles online say that VR dizziness is caused by insufficient frame rate.
Frame rate and latency are the main problems today, but they are not big problems, nor are they the decisive cause of dizziness. They can be handled well with a faster graphics card, a good IMU, a good screen, and head motion prediction algorithms. What deserves attention are the more fundamental causes of dizziness.
Here we have to distinguish between virtual reality and augmented reality.
In virtual reality, users cannot see the real world, and dizziness usually comes from the mismatch between the motion sensed by the semicircular canals of the inner ear and the motion seen by the eyes. That is why VR games often cause motion sickness and nausea. This problem cannot be solved by a single device: if the user is actually sitting still while the image moves at high speed, what device could fool the semicircular canals? Some solutions on the market, such as Omni VR, or VR systems with tracking such as HTC Vive, resolve the mismatch by letting you actually walk, but such systems are limited by the available space. The VOID, however, makes good use of VR's limitations: you do not necessarily run and jump, yet a small space is used to build a large scene that makes you feel you are in a big environment. Today most VR experiences and panoramic films move the viewpoint rather slowly, because otherwise you would feel sick.
Magic Leap, however, is AR (augmented reality). Because the wearer can already see the real world, there is no mismatch with the semicircular canals of the inner ear. For AR, the main challenge is the difference in sharpness between projected objects and real objects, and the approach proposed by Magic Leap solves this well. But all of this is theory; the actual engineering capability remains to be proven over time.
Q3. Why a head-mounted display? Why not glasses-free holography? How does Magic Leap do it?
For hundreds of years, people have dreamed of seeing a virtual object appear out of thin air, and sci-fi movies are full of holographic images floating in the air.
But physically this is very hard to do: pure air contains no medium that can reflect or refract light, and the most important thing for a display is the medium. Many rumors on WeChat claim that Magic Leap needs no glasses. I suspect a translation error is to blame. The video carries the caption "Shot directly through Magic Leap tech.", which many articles mistranslated as "seen directly" or "glasses-free holography". In fact, the video was shot with a camera through Magic Leap's technology.
At present, holography is basically still in the era of holographic film (as shown below, a small Buddha statue on holographic film that I saw at the light field seminar), or of the pseudo-holography used at Hatsune Miku concerts, where a projection array shines on special glass (it shows the image from one angle only and ignores light at other angles).
What Magic Leap wants to achieve is the vision of turning the whole world into your desktop. So, rather than building a 3D holographic transparent screen as the medium, as with Hatsune Miku, or covering the whole world with holographic film, it is easier to start directly from the human eye and put the whole light field right in front of it. In fact, NVIDIA is also making this kind of light field glasses.
NVIDIA's approach is to put a microlens array in front of a 2D display panel to generate a 4D light field. This amounts to mapping 2D pixels onto 4D samples, so the resolution is inevitably low; the same holds for light field cameras built this way (Lytro). I tried it myself, and the image basically looks like a mosaic.
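The resolution penalty of the microlens approach is simple arithmetic: the pixels under each lenslet are spent on directions, so the number of distinct spatial positions drops accordingly. The sketch below works this out for a hypothetical 1280 x 720 panel with 8 x 8 pixels under each lenslet; these numbers are made up for illustration and are not the specifications of NVIDIA's prototype.

```python
def microlens_light_field_resolution(panel_px_x, panel_px_y, px_per_lens):
    """Split a 2D panel's pixel budget between position and direction.

    Each lenslet covers px_per_lens x px_per_lens panel pixels. One lenslet
    yields one spatial sample, and the pixels under it become direction
    samples. Returns (spatial_x, spatial_y, directions_per_position).
    """
    spatial_x = panel_px_x // px_per_lens
    spatial_y = panel_px_y // px_per_lens
    directions = px_per_lens * px_per_lens
    return spatial_x, spatial_y, directions

# Hypothetical panel and lenslet size, for illustration only.
spatial_x, spatial_y, directions = microlens_light_field_resolution(1280, 720, 8)
```

With these assumed numbers, the perceived image has only `spatial_x` by `spatial_y` distinct positions, which is consistent with the mosaic-like look described above.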
Magic Leap displays the light field in a completely different way, using optical fiber projection. The fiber projection method itself is nothing new, however. Brian Schowengerdt works on fiber projection at Magic Leap. His advisor, Eric Seibel, a professor at the University of Washington, spent 8 years working on ultra-high-resolution fiber endoscopes. The basic principle is that a fiber bundle rotates at high speed inside a tube 1 mm in diameter, which lets it scan a wide area. The cleverness of Magic Leap's founder lay in finding the people who built these high-resolution fiber scanners. Because light paths are reversible, the scanner can be run in reverse to make a high-resolution projector. As shown in the figure, in their paper from six years ago, a fiber 1 mm wide and 9 mm long could already project a high-definition butterfly image a few inches across. The technology has presumably advanced well beyond that by now.
However, such a high-resolution fiber projector cannot by itself reproduce a light field; a microlens array has to be placed at the other end of the fiber to generate the 4D light field. You may wonder whether this is the same as NVIDIA's method. It is not: because the fiber bundle scans as it rotates, the microlens array does not need to be dense or large, only large enough to cover the area being scanned. This amounts to spreading a large amount of data over the time axis, like time-division multiplexing in communications, and it works because the human eye has difficulty resolving changes at around 100 frames per second. As long as the scanning frame rate is high enough, the eye cannot tell that the display is rotating. Therefore, Magic Leap's device can be small and still have high resolution.
Brian himself also came to Stanford and gave a talk on volumetric 3D displays using scanned light, which was probably about an early prototype of Magic Leap. (For related content, see Fiber Scanned Displays.)
Second, the perception part.
Q4. First of all, why does augmented reality need a perception part?
Because the device must know its own position in the real world (localization) and the 3D structure of the real world (mapping) in order to place virtual objects at the correct position in the display. Take the recent Magic Leap demo video as an example: a virtual solar system sits on the table and stays in place when the wearer's head moves. This requires the device to know the exact position and orientation of the viewer's viewpoint in real time, so that it can compute where the image should be displayed. You can also see the virtual sun reflected on the tabletop, which requires the device to know the 3D structure and surface properties of the table, so that the overlay can be projected correctly onto the image of the table. The difficulty lies in computing the whole perception part in real time so that the wearer does not notice any delay. If localization lags, the wearer feels dizzy and virtual objects drift on the screen and look fake, and the Cinematic Reality that Magic Leap claims becomes meaningless.
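Why a world-anchored object needs the head pose every frame can be sketched with a plain pinhole projection. The code below is a simplified illustration, not Magic Leap's pipeline: it assumes a camera that only translates and yaws about the vertical axis, looks along +z in its own frame, and has hypothetical intrinsics (`focal_px`, `cx`, `cy`).

```python
import math

def world_to_camera(point_w, cam_pos, yaw_rad):
    """Express a world point in the camera frame.

    The camera sits at cam_pos and is rotated by yaw_rad about the vertical
    (y) axis; with yaw 0 it looks along the world +z axis.
    """
    dx = point_w[0] - cam_pos[0]
    dy = point_w[1] - cam_pos[1]
    dz = point_w[2] - cam_pos[2]
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    # Apply the inverse (transpose) of the camera's yaw rotation.
    return (c * dx - s * dz, dy, s * dx + c * dz)

def project(point_w, cam_pos, yaw_rad, focal_px=500.0, cx=320.0, cy=240.0):
    """Pinhole projection to pixel coordinates; None if behind the camera."""
    x_c, y_c, z_c = world_to_camera(point_w, cam_pos, yaw_rad)
    if z_c <= 0.0:
        return None
    return (focal_px * x_c / z_c + cx, focal_px * y_c / z_c + cy)

# A virtual object anchored 2 m in front of the starting head position.
anchor = (0.0, 0.0, 2.0)
pixel_before = project(anchor, (0.0, 0.0, 0.0), 0.0)
# The head moves 10 cm to the right: the drawn position must shift left.
pixel_after = project(anchor, (0.1, 0.0, 0.0), 0.0)
```

If the pose estimate used for `project` lags behind the true head position, the object is drawn at the pixel for an old pose, which is exactly the drifting described above.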
The 3D perception part is not new. SLAM (Simultaneous Localization and Mapping) has been studied in computer vision and robotics for 30 years. By fusing various sensors (lidar, optical cameras, depth cameras, inertial sensors), a device can obtain its precise position in 3D space and reconstruct the surrounding 3D space in real time.
SLAM has been especially hot recently. Over the last two years and this year, large companies and venture capitalists have been acquiring and investing in companies with spatial localization technology, because the three strongest technology trends (self-driving cars, virtual reality, and drones) all depend on spatial localization. SLAM is the foundation for all of these. I also do research on SLAM, so I have a lot of exposure to it. To help you understand the field, here are some recent major events and figures in the SLAM world:
1. Sebastian Thrun, a robotics professor at Stanford University, is a pioneer of modern SLAM. After winning the DARPA Grand Challenge he went to Google to build self-driving cars. Most research groups in academic SLAM are academic descendants of Sebastian.
2. (Self-driving cars) This year Uber took over a large part of NREC (the National Robotics Engineering Center) from Carnegie Mellon University and jointly established the ATC (Advanced Technologies Center). The researchers who used to work on localization for rovers all went to Uber ATC to build self-driving cars.
3. (Virtual reality) Surreal Vision was recently acquired by Oculus. Its founder, Richard Newcombe, is the inventor of the well-known DTAM and KinectFusion (the core technology of HoloLens). Oculus also acquired 13th Lab (a company doing SLAM on mobile phones) last year.
4. (Virtual reality) Google's Project Tango released the world's first commercial tablet with SLAM this year. Apple acquired Metaio in May; Metaio's SLAM has long been used in AR apps. Intel released RealSense, a depth camera capable of SLAM, and at CES demonstrated automatic obstacle avoidance and automatic line inspection on a drone.
5. (Drones) Skydio, founded by Adam Bry, a student of Nicholas Roy (the founder of Google X's Project Wing drone program), received investment from a16z at a valuation of 20 million and hired Frank Dellaert, a leading SLAM professor at Georgia Institute of Technology, as chief scientist. (Related content: http://www.cc.gatech.edu/~ Delaret/Frank Delaret/Frank Delaret/Frank dellaert.html)
SLAM is a foundational technology, yet probably fewer than 100 people in the world do SLAM or sensor fusion well, and most of them know each other. With so many companies fighting over so few people, the competition is fierce. As a startup, Magic Leap therefore has to raise a great deal of capital to compete with large companies for talent.
Q5. What does Magic Leap's perception part look like?
This photo shows the technical architecture and roadmap of Magic Leap's perception part, as presented by Gary at the Stanford recruiting event. Four different computer vision technology stacks are built around calibration at the center.
1. As the picture shows, the core step of Magic Leap's whole perception part is calibration (of images and sensors). Active localization devices such as Magic Leap or HoloLens carry various cameras and sensors for localization, and calibrating the camera parameters and the relative parameters between cameras is the first step before any other work can start. If the camera and sensor parameters are inaccurate at this step, all later localization is meaningless. Anyone who has worked in computer vision knows that traditional calibration is time-consuming: you have to photograph a checkerboard with the camera and collect data over and over. Gary, however, has invented a new calibration method at Magic Leap that uses an oddly shaped structure as the calibration target, so that the cameras are calibrated in a single pass, extremely quickly. Photography was not allowed during this part of the talk.
2. Once calibration is done, the most important part begins: 3D perception and localization (the technology stack in the lower left corner), which is divided into four steps.
2.1 The first is planar surface tracking. In the virtual solar system demo, the virtual sun is reflected on the table, and the reflection changes position as the wearer moves, just as if a real sun were hanging in the air and reflecting off the tabletop. This requires the device to know in real time where the table surface is and to compute the relationship between the virtual sun and the plane, so that it can work out where the reflection falls and overlay it at the corresponding position on the wearer's glasses with correct depth. The difficulty lies in detecting the plane in real time and keeping the estimated plane position smooth (otherwise the reflection would jump). Judging from the demo, Magic Leap has done this step well.
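The geometry behind the moving reflection can be sketched in a few lines. This is a textbook mirror construction under simplifying assumptions (a perfectly flat, mirror-like table, a point light, a unit-length plane normal, and eye and light on the same side of the plane); it is not taken from Magic Leap's renderer. The highlight is where the line from the eye to the mirrored light crosses the plane.

```python
def signed_distance(point, plane_point, normal):
    """Signed distance from point to the plane (normal must be unit length)."""
    return sum((p - q) * n for p, q, n in zip(point, plane_point, normal))

def mirror_across_plane(point, plane_point, normal):
    """Reflect a point through the plane."""
    d = signed_distance(point, plane_point, normal)
    return tuple(p - 2.0 * d * n for p, n in zip(point, normal))

def highlight_on_plane(eye, light, plane_point, normal):
    """Point on the plane where the eye sees the light's mirror reflection."""
    mirrored = mirror_across_plane(light, plane_point, normal)
    d_eye = signed_distance(eye, plane_point, normal)
    d_mirrored = signed_distance(mirrored, plane_point, normal)
    # Parameter along the segment eye -> mirrored light where it hits the plane.
    t = d_eye / (d_eye - d_mirrored)
    return tuple(e + t * (m - e) for e, m in zip(eye, mirrored))

# Table plane y = 0 with upward normal; virtual sun 1 m above the origin.
table_point, table_normal = (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
sun = (0.0, 1.0, 0.0)
spot_near = highlight_on_plane((2.0, 1.0, 0.0), sun, table_point, table_normal)
spot_far = highlight_on_plane((4.0, 1.0, 0.0), sun, table_point, table_normal)
```

The two results differ, which shows why the reflection has to be recomputed as the wearer moves, and why a jittery plane estimate (`table_point`, `table_normal`) would make the highlight jump.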
2.2 Next is sparse SLAM. Gary showed their real-time 3D reconstruction and localization algorithm at the info session. For real-time performance, they first implemented a high-speed sparse or semi-sparse 3D localization algorithm. Judging by the results, it is not very different from the current open-source LSD-SLAM algorithm.
2.3 Next is sensor fusion: vision and IMU (fusing vision with inertial sensors).
Missiles usually rely on pure inertial sensors for active localization, but the same approach does not work with low-precision consumer-grade inertial sensors, which inevitably drift after double integration. Vision alone, on the other hand, is slow to process and easily occluded, so localization based on it is not robust. Fusing vision with inertial sensors has become a very popular approach in recent years.
For example:
Google Tango fuses an IMU with a depth camera and does it very well. DJI's Phantom 3 and Inspire 1 drones combine a monocular optical camera with the drone's inertial sensors and achieve remarkably stable hovering without GPS. HoloLens has done SLAM very well and even has a custom chip for it; its algorithm is said to descend directly from the core of KinectFusion, and in my own test the localization was excellent (I could stand facing a white, featureless wall and jump, and after returning to the center of the room the localization was still very accurate, with no drift at all).
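The drift from double integration mentioned above is easy to reproduce numerically. The sketch below simulates a device that is actually standing still but whose accelerometer reports a small constant bias; the bias value and sample rate are hypothetical, and simple Euler integration stands in for a real inertial navigation filter.

```python
def integrate_position(accel_samples, dt):
    """Double-integrate acceleration samples into a 1D position (Euler)."""
    velocity = 0.0
    position = 0.0
    for a in accel_samples:
        velocity += a * dt          # first integration: acceleration -> velocity
        position += velocity * dt   # second integration: velocity -> position
    return position

bias = 0.05   # m/s^2, assumed bias of a low-cost accelerometer
dt = 0.01     # s, assumed 100 Hz sample rate

# The device never moves, so any reported position is pure drift.
drift_after_1s = integrate_position([bias] * 100, dt)
drift_after_10s = integrate_position([bias] * 1000, dt)
```

The position error grows roughly with the square of elapsed time, so ten times longer means about a hundred times more drift. Fusing in a visual position fix at a lower rate bounds this growth, which is the motivation for combining the two sensors.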
2.4 Finally, 3D mapping and dense SLAM (3D map reconstruction). The following picture shows the 3D map reconstruction of Magic Leap's Mountain View office: just by walking around with the device, the 3D map of the whole office is recovered, complete with fine textures. Even the books on the shelves are reconstructed without distortion.
Because interaction in AR is a brand-new field, recognition and tracking algorithms based on machine vision become paramount for letting people interact smoothly with the virtual world. The new human-computer interaction experience needs a large reserve of technology behind it.
Gary did not go into detail on the next three branches, but their roadmap is visible. I have just added some notes to help you understand.
3.1 Crowdsourcing. Used to collect data for later machine learning. It requires a sound feedback and learning mechanism that collects data dynamically and incrementally.
3.2 Machine learning and deep learning. A machine learning architecture has to be built to produce the recognition algorithms that follow.
3.3 Scene object recognition. Identify objects in the scene and distinguish their types and features in order to enable better interaction. For example, when the system sees a puppy and recognizes it, it can turn the dog into a dog monster that you can fight directly.
3.4 Behavior recognition. Recognizing the behavior of people or things in the scene, such as running or jumping, walking or sitting, can be used for more dynamic game interaction. Incidentally, DeepGlint, a company in China run by Stanford alumni, is also doing research in this area.
Tracking
4.1 Gesture recognition. Used for interaction; in fact every AR/VR company is building up technology in this area.
4.2 Object tracking. This technology is very important. Take Magic Leap's demo of an elephant held in the hand: the system must at least know the 3D position of your hand and track it in real time before it can put the elephant in the right place.
4.3 3D scanning. It can turn real objects into virtual ones. For example, if you pick up a work of art and scan it in 3D, users far away can share and play with the same object in the virtual world.
4.4 Human body tracking. For example, a health bar and ability points could be attached to each person in the real scene.
5.1 Eye tracking. Gary explained that although Magic Leap's rendering does not require eye tracking, the rendering workload is enormous because a 4D light field has to be computed. With eye tracking, the load of object and scene rendering on the 3D engine can be reduced, which is an excellent optimization strategy.
5.2 Emotion recognition. If Magic Leap wants to become the kind of artificial intelligence operating system depicted in the movie "Her", it could recognize its owner's emotions and act as a caring emotional companion.
5.3 Biometrics. For example, identifying people in the real scene and displaying a name above each person's head. Face recognition is one such technology, and Face++, a company in China founded by alumni of Tsinghua's Yao Class, does it best.
Summary: Simply put, in the perception part Magic Leap is similar to many other companies. With Gary on board the ambition is great, but the competition in this area is very fierce.
Q6: Even if Magic Leap has solved the problem of perception and display, what is the next difficulty?
1. Computing hardware and computational load
Magic Leap has to compute a 4D light field, and the amount of computation is staggering. I don't know how Magic Leap handles this today. What if NVIDIA does not come up with a powerful enough mobile graphics chip? Will they build their own dedicated circuits? Carrying four Titan X cards around is no joke.
The picture below is from one of the VR demos at SIGGRAPH 2015, which I attended this year. Everyone was playing VR with a big computer backpack on their backs. Ten years from now, will people find today's pursuit of VR funny? Ha ha.
2. Batteries! Batteries! Batteries! The pain point of every electronic device.
3. Operating system
To be honest, if "the world is your new desktop" is their vision, then no existing operating system can support the interaction that vision requires. They will have to build the wheel themselves.
4. Add physical feelings to the interactive experience of virtual objects.
Interactive gloves and controllers that provide a sense of touch are now a hot topic in the VR world. Judging from Magic Leap's current patents, it does not appear to have any deeper insight here. Perhaps some Kickstarter project will end up dominating, and Magic Leap will then acquire it.