Zebeth Media Solutions

perceptron

AI that sees with sound, learns to walk, and predicts seismic physics • ZebethMedia

Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron, aims to collect some of the most relevant recent discoveries and papers (particularly in, but not limited to, artificial intelligence) and explain why they matter.

This month, engineers at Meta detailed two recent innovations from the depths of the company’s research labs: an AI system that compresses audio files and an algorithm that can accelerate protein-folding AI performance by 60x. Elsewhere, scientists at MIT revealed that they’re using spatial acoustic information to help machines better envision their environments, simulating how a listener would hear a sound from any point in a room.

Meta’s compression work doesn’t exactly reach unexplored territory. Last year, Google announced Lyra, a neural audio codec trained to compress low-bitrate speech. But Meta claims that its system is the first to work for CD-quality, stereo audio, making it useful for commercial applications like voice calls.

Image: An architectural drawing of Meta’s AI audio compression model.

Using AI, Meta’s compression system, called Encodec, can compress and decompress audio in real time on a single CPU core at rates of around 1.5 kbps to 12 kbps. Compared to MP3 at 64 kbps, Encodec can achieve a roughly 10x compression rate without a perceptible loss in quality.

The researchers behind Encodec say that human evaluators preferred the quality of audio processed by Encodec over Lyra-processed audio, suggesting that Encodec could eventually be used to deliver better-quality audio in situations where bandwidth is constrained or at a premium.

Meta’s protein folding work has less immediate commercial potential. But it could lay the groundwork for important scientific research in the field of biology.

Image: Protein structures predicted by Meta’s system.
Meta says its AI system, ESMFold, predicted the structures of around 600 million proteins from bacteria, viruses and other microbes that haven’t yet been characterized. That’s nearly triple the 220 million structures that Alphabet-backed DeepMind managed to predict earlier this year, which covered nearly every protein from known organisms in DNA databases.

Meta’s system isn’t as accurate as DeepMind’s. Of the ~600 million structures it generated, only a third were “high quality.” But it’s 60 times faster at predicting structures, enabling it to scale structure prediction to much larger databases of proteins.

Not to give Meta outsize attention, but the company’s AI division also detailed this month a system designed to reason mathematically. Researchers at the company say that their “neural problem solver” learned from a data set of successful mathematical proofs to generalize to new, different kinds of problems.

Meta isn’t the first to build such a system. OpenAI announced its own in February, built on the Lean proof assistant. Separately, DeepMind has experimented with systems that can solve challenging mathematical problems in the studies of symmetries and knots. But Meta claims that its neural problem solver was able to solve five times more International Math Olympiad problems than any previous AI system and bested other systems on widely used math benchmarks.

Meta notes that math-solving AI could benefit the fields of software verification, cryptography and even aerospace.

Turning our attention to MIT’s work, research scientists there developed a machine learning model that can capture how sounds in a room will propagate through the space. By modeling the acoustics, the system can learn a room’s geometry from sound recordings, which can then be used to build visual renderings of a room.

The researchers say the tech could be applied to virtual and augmented reality software or robots that have to navigate complex environments.
In the future, they plan to enhance the system so that it can generalize to new and larger scenes, such as entire buildings or even whole towns and cities.

At Berkeley’s robotics department, two separate teams are accelerating the rate at which a quadrupedal robot can learn to walk and do other tricks. One team looked to combine the best-of-breed work out of numerous other advances in reinforcement learning to allow a robot to go from blank slate to robust walking on uncertain terrain in just 20 minutes of real time.

“Perhaps surprisingly, we find that with several careful design decisions in terms of the task setup and algorithm implementation, it is possible for a quadrupedal robot to learn to walk from scratch with deep RL in under 20 minutes, across a range of different environments and surface types. Crucially, this does not require novel algorithmic components or any other unexpected innovation,” write the researchers.

Instead, they select and combine some state-of-the-art approaches and get amazing results. You can read the paper here.

Image: Robot dog demo from EECS professor Pieter Abbeel’s lab in Berkeley, Calif. in 2022. (Photo courtesy Philipp Wu/Berkeley Engineering)

Another locomotion learning project, from (ZebethMedia’s pal) Pieter Abbeel’s lab, was described as “training an imagination.” They set up the robot with the ability to attempt predictions of how its actions will work out, and though it starts out pretty helpless, it quickly gains more knowledge about the world and how it works. This leads to a better prediction process, which leads to better knowledge, and so on in a feedback loop until it’s walking in under an hour. It learns just as quickly to recover from being pushed or otherwise “perturbed,” as the lingo has it. Their work is documented here.
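The predict-act-learn loop described above can be illustrated with a deliberately tiny sketch in Python. This is not the Berkeley team’s method, which relies on deep networks and a real robot; it is a toy one-dimensional world, and every name in it (`true_step`, `WorldModel`, `choose_action`, `run`) is a hypothetical stand-in chosen for illustration. The agent starts with no idea what its actions do, learns a one-parameter model from its own experience, and then uses that model’s predictions to pick actions.

```python
import random


def true_step(x, a):
    """The real environment, hidden from the agent: each action moves 0.5 units."""
    return x + 0.5 * a


class WorldModel:
    """Learns the rule dx = w * a by least squares over observed transitions."""

    def __init__(self):
        self.sum_aa = 0.0  # running sum of a * a
        self.sum_ad = 0.0  # running sum of a * dx
        self.w = 0.0       # current estimate; 0.0 means "no idea yet"

    def update(self, a, dx):
        self.sum_aa += a * a
        self.sum_ad += a * dx
        if self.sum_aa > 0:
            self.w = self.sum_ad / self.sum_aa

    def predict(self, x, a):
        """The 'imagination': where the model expects action a to lead."""
        return x + self.w * a


def choose_action(model, x, goal, actions, rng):
    """Flail at random until the model knows something, then plan with it."""
    if model.w == 0.0:
        return rng.choice([a for a in actions if a != 0])
    return min(actions, key=lambda a: abs(model.predict(x, a) - goal))


def run(goal=5.0, steps=40, seed=0):
    """Alternate acting, observing and updating; return (learned w, final x)."""
    rng = random.Random(seed)
    model = WorldModel()
    actions = [-1, 0, 1]
    x = 0.0
    for _ in range(steps):
        a = choose_action(model, x, goal, actions, rng)
        nxt = true_step(x, a)
        model.update(a, nxt - x)  # better data gives better predictions
        x = nxt
    return model.w, x


if __name__ == "__main__":
    w, x = run()
    print(f"learned step size: {w}, final position: {x}")
```

In this toy, a single informative move is enough to learn the dynamics, after which the agent walks straight to the goal and stays there. A real robot faces a far harder version of the same loop, with noisy sensors and high-dimensional dynamics.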
Work with a potentially more immediate application came earlier this month out of Los Alamos National Laboratory, where researchers developed a machine learning technique to predict the friction that occurs during earthquakes, providing a way to forecast earthquakes. The team says that, using a language model, it was able to analyze the statistical features of seismic signals emitted from a fault in a laboratory earthquake machine to project the timing of the next quake. “The model is not constrained with physics, but it predicts the physics, the actual behavior of the system,” said Chris Johnson, one of the research leads on the

AI saving whales, steadying gaits and banishing traffic • ZebethMedia

Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron, aims to collect some of the most relevant recent discoveries and papers (particularly in, but not limited to, artificial intelligence) and explain why they matter.

Over the past few weeks, researchers at MIT have detailed their work on a system to track the progression of Parkinson’s patients by continuously monitoring their gait speed. Elsewhere, Whale Safe, a project spearheaded by the Benioff Ocean Science Laboratory and partners, launched buoys equipped with AI-powered sensors in an experiment to prevent ships from striking whales. Other aspects of ecology and academics also saw advances powered by machine learning.

The MIT Parkinson’s-tracking effort aims to help clinicians overcome challenges in treating the estimated 10 million people afflicted by the disease globally. Typically, Parkinson’s patients’ motor skills and cognitive functions are evaluated during clinical visits, but these evaluations can be skewed by outside factors like tiredness. Add to that the fact that commuting to an office is too overwhelming a prospect for many patients, and their situation grows starker.

As an alternative, the MIT team proposes an at-home device that gathers data using radio signals reflecting off of a patient’s body as they move around their home. About the size of a Wi-Fi router, the device, which runs all day, uses an algorithm to pick out the patient’s signals even when there are other people moving around the room.

In a study published in the journal Science Translational Medicine, the MIT researchers showed that their device was able to effectively track Parkinson’s progression and severity across dozens of participants during a pilot study.
For instance, they showed that gait speed declined almost twice as fast for people with Parkinson’s compared to those without, and that daily fluctuations in a patient’s walking speed corresponded with how well they were responding to their medication.

Moving from healthcare to the plight of whales, the Whale Safe project (whose stated mission is to “utilize best-in-class technology with best-practice conservation strategies to create a solution to reduce risk to whales”) in late September deployed buoys equipped with onboard computers that can record whale sounds using an underwater microphone. An AI system detects the sounds of particular species and relays the results to a researcher, so that the location of the animal (or animals) can be calculated by corroborating the data with water conditions and local records of whale sightings. The whales’ locations are then communicated to nearby ships so they can reroute as necessary.

Collisions with ships are a major cause of death for whales, many species of which are endangered. According to research carried out by the nonprofit Friend of the Sea, ship strikes kill more than 20,000 whales every year. That’s destructive to local ecosystems, as whales play a significant role in capturing carbon from the atmosphere. A single great whale can sequester around 33 tons of carbon dioxide on average.

Image Credits: Benioff Ocean Science Laboratory

Whale Safe currently has buoys deployed in the Santa Barbara Channel near the ports of Los Angeles and Long Beach. In the future, the project aims to install buoys in other American coastal areas including Seattle, Vancouver, and San Diego.

Conserving forests is another area where technology is being brought into play. Surveys of forest land from above using lidar are helpful in estimating growth and other metrics, but the data they produce aren’t always easy to read.
Point clouds from lidar are just undifferentiated height and distance maps: the forest is one big surface, not a bunch of individual trees. Individual trees tend to have to be tracked by humans on the ground.

Purdue researchers have built an algorithm (not quite AI but we’ll allow it this time) that turns a big lump of 3D lidar data into individually segmented trees, allowing not just canopy and growth data to be collected but a good estimate of the actual trees. It does this by calculating the most efficient path from a given point to the ground, essentially the reverse of what nutrients would do in a tree. The results are quite accurate (after being checked against an in-person inventory) and could contribute to far better tracking of forests and resources in the future.

Self-driving cars are appearing on our streets with more frequency these days, even if they’re still basically just beta tests. As their numbers grow, how should policy makers and civic engineers accommodate them? Carnegie Mellon researchers put together a policy brief that makes a few interesting arguments.

Image: Diagram showing how collaborative decision making, in which a few cars opt for a longer route, actually makes it faster for most.

The key difference, they argue, is that autonomous vehicles drive “altruistically,” which is to say they deliberately accommodate other drivers by, say, always allowing other drivers to merge ahead of them. This type of behavior can be taken advantage of, but at a policy level it should be rewarded, they argue, and AVs should be given access to things like toll roads and HOV and bus lanes, since they won’t use them “selfishly.”

They also recommend that planning agencies take a real zoomed-out view when making decisions, involving other transportation types like bikes and scooters and looking at how inter-AV and inter-fleet communication should be required or augmented. You can read the full 23-page report here (PDF).
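To see how some cars taking a longer route can make the trip faster for others, consider a textbook two-route network, often called Pigou’s example. The Python sketch below is purely illustrative and is not the model used in the Carnegie Mellon brief. The route costs and the function names (`travel_times`, `best_split`) are assumptions chosen for simplicity: the short route’s travel time equals the share of traffic on it, while the long route always takes one unit of time.

```python
def travel_times(share_on_short):
    """Return (short_route_time, long_route_time, average_time).

    Assumed toy network: the short route's travel time equals the share of
    traffic using it (0.0 to 1.0), while the long route always takes 1.0.
    """
    short_time = share_on_short
    long_time = 1.0
    average = share_on_short * short_time + (1.0 - share_on_short) * long_time
    return short_time, long_time, average


def best_split(steps=100):
    """Scan splits on a grid and return the share on the short route that
    minimizes the average travel time."""
    shares = [i / steps for i in range(steps + 1)]
    return min(shares, key=lambda s: travel_times(s)[2])


if __name__ == "__main__":
    cases = [
        ("everyone takes the short route", 1.0),
        ("half opt for the long route", 0.5),
    ]
    for label, share in cases:
        short_time, long_time, average = travel_times(share)
        print(f"{label}: short={short_time:.2f}, long={long_time:.2f}, "
              f"average={average:.2f}")
    print("best share on short route:", best_split())
```

In this toy network, purely self-interested drivers all pile onto the short route, since it is never slower than the long one, and everyone’s trip takes 1.0. If half the cars accept the long route instead, they are no worse off (still 1.0), while the other half see their time cut to 0.5, bringing the average down to 0.75.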
Turning from traffic to translation, Meta this past week announced a new system, Universal Speech Translator, that’s designed to interpret unwritten languages like Hokkien. As an Engadget piece on the system notes, thousands of spoken languages don’t have a written component, posing a problem for most machine learning translation systems, which typically need to convert speech to written words before translating into the new language and converting the text back to speech. To get around the lack of labeled examples of language, Universal Speech Translator converts speech into “acoustic units”
