
Stability AI backs effort to bring machine learning to biomed • ZebethMedia

Stability AI, the venture-backed startup behind the text-to-image AI system Stable Diffusion, is funding a wide-ranging effort to apply AI to the frontiers of biotech. Called OpenBioML, the endeavor’s first projects will focus on machine learning-based approaches to DNA sequencing, protein folding, and computational biochemistry. The company’s founders describe OpenBioML as an “open research laboratory” that aims to explore the intersection of AI and biology in a setting where students, professionals and researchers can participate and collaborate, according to Stability AI CEO Emad Mostaque. “OpenBioML is one of the independent research communities that Stability supports,” Mostaque told ZebethMedia in an email interview. “Stability looks to develop and democratize AI, and through OpenBioML, we see an opportunity to advance the state of the art in sciences, health and medicine.” Given the controversy surrounding Stable Diffusion — Stability AI’s AI system that generates art from text descriptions, similar to OpenAI’s DALL-E 2 — one might be understandably wary of Stability AI’s first venture into health care. The startup has taken a laissez-faire approach to governance, allowing developers to use the system however they wish, including for celebrity deepfakes and pornography. Stability AI’s ethically questionable decisions to date aside, machine learning in medicine is a minefield. While the tech has been successfully applied to diagnose conditions like skin and eye diseases, among others, research has shown that algorithms can develop biases that lead to worse care for some patients. An April 2021 study, for example, found that statistical models used to predict suicide risk in mental health patients performed well for white and Asian patients but poorly for Black patients. Wisely, OpenBioML is starting on safer territory.
Its first projects are:

- BioLM, which seeks to apply natural language processing (NLP) techniques to the fields of computational biology and chemistry
- DNA-Diffusion, which aims to develop AI that can generate DNA sequences from text prompts
- LibreFold, which looks to increase access to AI protein structure prediction systems similar to DeepMind’s AlphaFold 2

Each project is led by independent researchers, but Stability AI is providing support in the form of access to its AWS-hosted cluster of over 5,000 Nvidia A100 GPUs to train the AI systems. According to Niccolò Zanichelli, a computer science undergraduate at the University of Parma and one of the lead researchers at OpenBioML, this will be enough processing power and storage to eventually train up to ten different AlphaFold 2-like systems in parallel. “A lot of computational biology research already leads to open-source releases. However, much of it happens at the level of a single lab and is therefore usually constrained by insufficient computational resources,” Zanichelli told ZebethMedia via email. “We want to change this by encouraging large-scale collaborations and, thanks to the support of Stability AI, back those collaborations with resources that only the largest industrial laboratories have access to.”

Generating DNA sequences

Of OpenBioML’s ongoing projects, DNA-Diffusion — led by pathology professor Luca Pinello’s lab at the Massachusetts General Hospital & Harvard Medical School — is perhaps the most ambitious. The goal is to use generative AI systems to learn and apply the rules of “regulatory” sequences of DNA, or segments of nucleic acid molecules that influence the expression of specific genes within an organism. Many diseases and disorders are the result of misregulated genes, but science has yet to discover a reliable process for identifying — much less changing — these regulatory sequences.
DNA-Diffusion proposes using a type of AI system known as a diffusion model to generate cell-type-specific regulatory DNA sequences. Diffusion models — which underpin image generators like Stable Diffusion and OpenAI’s DALL-E 2 — create new data (e.g. DNA sequences) by learning how to destroy and recover many existing samples of data. As they’re fed the samples, the models get better at recovering all the data they had previously destroyed to generate new works.

Image Credits: Stability AI

“Diffusion has seen widespread success in multimodal generative models, and it is now starting to be applied to computational biology, for example for the generation of novel protein structures,” Zanichelli said. “With DNA-Diffusion, we’re now exploring its application to genomic sequences.” If all goes according to plan, the DNA-Diffusion project will produce a diffusion model that can generate regulatory DNA sequences from text instructions like “A sequence that will activate a gene to its maximum expression level in cell type X” and “A sequence that activates a gene in liver and heart, but not in brain.” Such a model could also help interpret the components of regulatory sequences, Zanichelli says — improving the scientific community’s understanding of the role of regulatory sequences in different diseases. It’s worth noting that this is largely theoretical. While preliminary research on applying diffusion to protein folding seems promising, it’s very early days, Zanichelli admits — hence the push to involve the wider AI community.

Predicting protein structures

OpenBioML’s LibreFold, while smaller in scope, is more likely to bear immediate fruit. The project seeks to arrive at a better understanding of machine learning systems that predict protein structures, in addition to ways to improve them.
As my colleague Devin Coldewey covered in his piece about DeepMind’s work on AlphaFold 2, AI systems that accurately predict protein shape are relatively new on the scene but transformative in terms of their potential. Proteins comprise sequences of amino acids that fold into shapes to accomplish different tasks within living organisms. The process of determining what shape an amino acid sequence will create was once an arduous, error-prone undertaking. AI systems like AlphaFold 2 changed that; thanks to them, over 98% of protein structures in the human body are known to science today, as well as hundreds of thousands of other structures in organisms like E. coli and yeast. Few groups have the engineering expertise and resources necessary to develop this kind of AI, though. DeepMind spent days training AlphaFold 2 on tensor processing units (TPUs), Google’s costly AI accelerator hardware. And amino acid sequence training data sets are often proprietary or released under non-commercial licenses.
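The destroy-and-recover training recipe behind diffusion models, described above, can be sketched on toy 1-D data. Everything here is illustrative: the "model" is a per-step linear fit rather than a neural network, and the noise schedule and constants are arbitrary choices, not anything taken from Stable Diffusion or DNA-Diffusion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 1-D samples from a bimodal distribution, standing in for
# the "existing samples of data" (images, DNA sequences, ...).
data = np.concatenate([rng.normal(-2, 0.3, 500), rng.normal(2, 0.3, 500)])

T = 50                               # number of noising steps
betas = np.linspace(1e-3, 0.2, T)    # noise schedule (arbitrary here)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal retention

def noise(x0, t):
    """Forward process: destroy data by mixing in Gaussian noise."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps, eps

# "Training": for each step t, fit a map x_t -> predicted noise by least
# squares. A real diffusion model uses a neural network conditioned on t
# (and, in text-to-X systems, on a prompt embedding).
coefs = np.zeros(T)
for t in range(T):
    xt, eps = noise(data, t)
    coefs[t] = (xt @ eps) / (xt @ xt)

def sample(n=5):
    """Reverse process: start from pure noise and denoise step by step."""
    x = rng.normal(size=n)
    for t in reversed(range(T)):
        eps_hat = coefs[t] * x                      # predicted noise
        x0_hat = (x - np.sqrt(1 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])
        if t > 0:                                   # crude DDIM-style step
            x = np.sqrt(alpha_bar[t - 1]) * x0_hat + np.sqrt(1 - alpha_bar[t - 1]) * eps_hat
        else:
            x = x0_hat
    return x

print(sample())  # fresh draws with roughly the data's overall spread
```

A linear denoiser can only recover the data's coarse statistics; swapping in a deep network conditioned on the timestep (and a text prompt) is what turns this recipe into systems like Stable Diffusion or, prospectively, DNA-Diffusion.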

OpenAI will give roughly 10 AI startups $1M each and early access to its systems • ZebethMedia

OpenAI, the San Francisco-based lab behind AI systems like GPT-3 and DALL-E 2, today launched a new program to provide early-stage AI startups with capital and access to OpenAI tech and resources. Called Converge, the cohort will be financed by the OpenAI Startup Fund, OpenAI says. The $100 million entrepreneurial tranche was announced last May and was backed by Microsoft and other partners. The 10 or so founders chosen for Converge will receive $1 million each and admission to five weeks of office hours, workshops and events with OpenAI staff, as well as early access to OpenAI models and “programming tailored to AI companies.” “We’re excited to meet groups across all phases of the seed stage, from pre-idea solo founders to co-founding teams already working on a product,” OpenAI writes in a blog post shared with ZebethMedia ahead of today’s announcement. “Engineers, designers, researchers, and product builders … from all backgrounds, disciplines, and experience levels are encouraged to apply, and prior experience working with AI systems is not required.” The deadline to apply is November 25, but OpenAI notes that it’ll continue to evaluate applications after that date for future cohorts. When OpenAI first detailed the OpenAI Startup Fund, it said recipients of cash from the fund would receive access to Azure resources from Microsoft. It’s unclear whether the same benefit will be afforded to Converge participants; we’ve asked OpenAI to clarify. We’ve also asked OpenAI to disclose the full terms for Converge, including the equity agreement, and we’ll update this piece once we hear back. Beyond Converge, surprisingly, there aren’t many incubator programs focused exclusively on AI startups. The Allen Institute for AI has a small accelerator that launched in 2017, which provides up to a $500,000 pre-seed investment and up to $450,000 in cloud compute credits. 
Google Brain founder Andrew Ng heads up the AI Fund, a $175 million tranche to initiate new AI-centered businesses and companies. And Nat Friedman (formerly of GitHub) and Daniel Gross (ex-Apple) fund the AI Grant, which provides up to $250,000 for “AI-native” product startups and $250,000 in cloud credits from Azure. With Converge, OpenAI is no doubt looking to cash in on the increasingly lucrative industry that is AI. The Information reports that OpenAI — which itself is reportedly in talks to raise cash from Microsoft at a nearly $20 billion valuation — has agreed to lead financing of Descript, an AI-powered audio and video editing app, at a valuation of around $550 million. AI startup Cohere is said to be negotiating a $200 million round led by Google, while Stability AI, the company supporting the development of generative AI systems, including Stable Diffusion, recently raised $101 million. The size of the largest AI startup financing rounds doesn’t necessarily correlate with revenue, given the enormous expenses (personnel, compute, etc.) involved in developing state-of-the-art AI systems. (Training Stable Diffusion alone cost around $600,000, according to Stability AI.) But the continued willingness of investors to cut these startups massive checks — see Inflection AI‘s $225 million raise, Anthropic’s $580 million in new funding and so on — suggests that they have confidence in an eventual return on investment.

AI saving whales, steadying gaits and banishing traffic • ZebethMedia

Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron, aims to collect some of the most relevant recent discoveries and papers — particularly in, but not limited to, artificial intelligence — and explain why they matter. Over the past few weeks, researchers at MIT have detailed their work on a system to track the progression of Parkinson’s patients by continuously monitoring their gait speed. Elsewhere, Whale Safe, a project spearheaded by the Benioff Ocean Science Laboratory and partners, launched buoys equipped with AI-powered sensors in an experiment to prevent ships from striking whales. Other aspects of ecology and academics also saw advances powered by machine learning. The MIT Parkinson’s-tracking effort aims to help clinicians overcome challenges in treating the estimated 10 million people afflicted by the disease globally. Typically, Parkinson’s patients’ motor skills and cognitive functions are evaluated during clinical visits, but these evaluations can be skewed by outside factors like tiredness. Add to that the fact that commuting to an office is too overwhelming a prospect for many patients, and their situation grows starker. As an alternative, the MIT team proposes an at-home device that gathers data using radio signals reflecting off of a patient’s body as they move around their home. About the size of a Wi-Fi router, the device, which runs all day, uses an algorithm to pick out the signals even when other people are moving around the room. In a study published in the journal Science Translational Medicine, the MIT researchers showed that their device was able to effectively track Parkinson’s progression and severity across dozens of participants during a pilot study.
For instance, they showed that gait speed declined almost twice as fast for people with Parkinson’s compared to those without, and that daily fluctuations in a patient’s walking speed corresponded with how well they were responding to their medication. Moving from healthcare to the plight of whales, the Whale Safe project — whose stated mission is to “utilize best-in-class technology with best-practice conservation strategies to create a solution to reduce risk to whales” — in late September deployed buoys equipped with onboard computers that can record whale sounds using an underwater microphone. An AI system detects the sounds of particular species and relays the results to a researcher, so that the location of the animal — or animals — can be calculated by corroborating the data with water conditions and local records of whale sightings. The whales’ locations are then communicated to nearby ships so they can reroute as necessary. Collisions with ships are a major cause of death for whales — many species of which are endangered. According to research carried out by the nonprofit Friend of the Sea, ship strikes kill more than 20,000 whales every year. That’s destructive to local ecosystems, as whales play a significant role in capturing carbon from the atmosphere. A single great whale can sequester around 33 tons of carbon dioxide on average.

Image Credits: Benioff Ocean Science Laboratory

Whale Safe currently has buoys deployed in the Santa Barbara Channel near the ports of Los Angeles and Long Beach. In the future, the project aims to install buoys in other North American coastal areas, including Seattle, Vancouver and San Diego. Conserving forests is another area where technology is being brought into play. Surveys of forest land from above using lidar are helpful in estimating growth and other metrics, but the data they produce aren’t always easy to read.
Point clouds from lidar are just undifferentiated height and distance maps — the forest is one big surface, not a bunch of individual trees, which typically have to be tracked by humans on the ground. Purdue researchers have built an algorithm (not quite AI, but we’ll allow it this time) that turns a big lump of 3D lidar data into individually segmented trees, allowing not just canopy and growth data to be collected but a good estimate of the actual trees. It does this by calculating the most efficient path from a given point to the ground, essentially the reverse of what nutrients would do in a tree. The results are quite accurate (after being checked against an in-person inventory) and could contribute to far better tracking of forests and resources in the future.

Self-driving cars are appearing on our streets with more frequency these days, even if they’re still basically just beta tests. As their numbers grow, how should policymakers and civic engineers accommodate them? Carnegie Mellon researchers put together a policy brief that makes a few interesting arguments.

Diagram showing how collaborative decision-making, in which a few cars opt for a longer route, actually makes it faster for most.

The key difference, they argue, is that autonomous vehicles drive “altruistically,” which is to say they deliberately accommodate other drivers — by, say, always allowing other drivers to merge ahead of them. This type of behavior can be taken advantage of, but at a policy level it should be rewarded, they argue, and AVs should be given access to things like toll roads and HOV and bus lanes, since they won’t use them “selfishly.” They also recommend that planning agencies take a real zoomed-out view when making decisions, involving other transportation types like bikes and scooters, and looking at how inter-AV and inter-fleet communication should be required or augmented. You can read the full 23-page report here (PDF).
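The Purdue tree-segmentation idea mentioned earlier — trace each point's cheapest path down to the ground, the reverse of nutrient flow — can be illustrated with a Dijkstra search over a toy point cloud. The points, distance threshold and ground test below are all invented for the sketch; a real implementation works on millions of points with a proper nearest-neighbour index.

```python
import heapq
from math import dist

# Toy lidar point cloud: (x, y, z) tuples. Two "trees" of points rise
# above two ground returns at z == 0. Real clouds have millions of points.
points = [
    (0, 0, 0), (5, 0, 0),                   # ground hits
    (0, 0, 1), (0.2, 0, 2), (0, 0.3, 3),    # stem/canopy of tree A
    (5, 0, 1), (5.2, 0, 2), (5, 0.2, 3),    # stem/canopy of tree B
]

def segment_trees(points, max_edge=1.6):
    """Assign each point to the ground point it reaches most cheaply:
    a multi-source Dijkstra over a proximity graph. `max_edge` is an
    illustrative cutoff on which points count as connected."""
    n = len(points)
    ground = [i for i, p in enumerate(points) if p[2] == 0]
    cost = {i: 0.0 for i in ground}
    root = {}                                 # point index -> ground index
    heap = [(0.0, i, i) for i in ground]
    while heap:
        d, i, r = heapq.heappop(heap)
        if d > cost.get(i, float("inf")) or i in root:
            continue                          # stale or already settled
        root[i] = r
        for j in range(n):
            if j == i:
                continue
            w = dist(points[i], points[j])
            if w <= max_edge and d + w < cost.get(j, float("inf")):
                cost[j] = d + w
                heapq.heappush(heap, (d + w, j, r))
    return root

labels = segment_trees(points)
print(labels)  # points 2-4 trace back to ground point 0, points 5-7 to ground point 1
```

Grouping points by the ground return their cheapest path reaches is what turns one undifferentiated surface into individually segmented trees.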
Turning from traffic to translation, Meta this past week announced a new system, Universal Speech Translator, that’s designed to interpret unwritten languages like Hokkien. As an Engadget piece on the system notes, thousands of spoken languages don’t have a written component, posing a problem for most machine learning translation systems, which typically need to convert speech into text before translating it and then converting the translated text back to speech. To get around the lack of labeled written examples, Universal Speech Translator converts speech into “acoustic units” and generates speech waveforms directly from them, with no intermediate text.

Bumble open sourced its AI that detects unsolicited nudes • ZebethMedia

As part of its larger commitment to combat “cyberflashing,” the dating app Bumble is open sourcing its AI tool that detects unsolicited lewd images. First debuted in 2019, Private Detector (let’s take a moment to let that name sink in) blurs out nudes that are sent through the Bumble app, giving the user on the receiving end the choice of whether to open the image. “Even though the number of users sending lewd images on our apps is luckily a negligible minority — just 0.1% — our scale allows us to collect a best-in-the-industry dataset of both lewd and non-lewd images, tailored to achieve the best possible performances on the task,” the company wrote in a press release. Now available on GitHub, a refined version of the AI is available for commercial use, distribution and modification. Though it’s not exactly cutting-edge technology to develop a model that detects nude images, it’s something that smaller companies probably don’t have the time to develop themselves. So, other dating apps (or any product where people might send dick pics, AKA the entire internet?) could feasibly integrate this technology into their own products, helping shield users from undesired lewd content. Since releasing Private Detector, Bumble has also worked with U.S. legislators to enforce legal consequences for sending unsolicited nudes. “There’s a need to address this issue beyond Bumble’s product ecosystem and engage in a larger conversation about how to address the issue of unsolicited lewd photos — also known as cyberflashing — to make the internet a safer and kinder place for everyone,” Bumble added. When Bumble first introduced this AI, the company claimed it had 98% accuracy.
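The integration pattern the article implies — score an incoming image, blur it if flagged, and let the recipient decide whether to reveal it — can be sketched in plain Python. The classifier below is a stub standing in for the real open-sourced model, and the threshold and blur kernel are illustrative, not Bumble's actual settings.

```python
from typing import List, Tuple

Image = List[List[int]]   # tiny grayscale image as a 2-D list of pixels

BLUR_THRESHOLD = 0.5      # illustrative cutoff, not Bumble's setting

def lewd_score(img: Image) -> float:
    """Stand-in for a real classifier such as the open-sourced Private
    Detector model; a real integration would load the released checkpoint
    and return its probability. This stub returns a fixed value so the
    example runs anywhere."""
    return 0.9

def box_blur(img: Image) -> Image:
    """Cheap 3x3 box blur standing in for the blurred preview."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - 1), min(h, y + 2))
                    for xx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) // len(vals)
    return out

def screen_image(img: Image) -> Tuple[Image, bool]:
    """Mirror the receive-side flow the article describes: flagged images
    are delivered blurred, and the recipient chooses whether to reveal
    the original."""
    flagged = lewd_score(img) >= BLUR_THRESHOLD
    return (box_blur(img) if flagged else img), flagged

incoming = [[0, 255, 0], [255, 0, 255], [0, 255, 0]]
preview, flagged = screen_image(incoming)
print(flagged, preview[1][1])  # True 113
```

The open-sourced model would slot in where `lewd_score` is stubbed; everything downstream — blurring, consent UI, logging — stays app-specific.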

Carving out conviction around the future of AI with Sarah Guo • ZebethMedia

There’s no better way to show you have high conviction in yourself as an investor than being the biggest LP in your $101 million fund, right? Especially if you name your firm Conviction, as Sarah Guo did after leaving Greylock following a decade of investing for the well-known venture group. Last week, she announced that she raised $101 million for her new fund to back companies that are building artificial intelligence and what she describes as software 3.0. Guo spoke to ZebethMedia’s Equity podcast, co-hosted by Natasha Mascarenhas and Alex Wilhelm, about her inaugural fund and the broader market that she is investing in today. The entire conversation is live now wherever you find podcasts, so take a listen if you haven’t yet. Below we extracted four key excerpts from the interview to discuss further. Guo’s comments were edited lightly for clarity.

Think of venture in innings

Part of the allure of startups is that when things go right — which they often don’t — you might just find yourself as an early employee of a rocket ship. That counts in VC, too, of course, if you were the first person to back a company like Airtable or see the power of connected fitness. But what happens when you want to disrupt a category that has been around the block a few times? Guo shared her framework around venture innings, and how that plays a role in her new focus areas at Conviction:

Microsoft announces Syntex, a set of automated document and data processing services • ZebethMedia

Two years ago, Microsoft debuted SharePoint Syntex, which leverages AI to automate the capture and classification of data from documents — building on SharePoint’s existing services. Today marks the expansion of the platform into Microsoft Syntex, a set of new products and capabilities including file annotation and data extraction. Syntex reads, tags and indexes document content — whether digital or physical — making it searchable and available within Microsoft 365 apps and helping manage the content lifecycle with security and retention settings. According to Chris McNulty, the director of Microsoft Syntex, driving the launch was customers’ increasing desire to “do more with less,” particularly as a recession looms. A 2021 survey from Dimensional Research found that more than two-thirds of companies leave valuable data untapped, largely because of problems building pipelines to access that data. “Just as business intelligence transformed the way companies use data to drive business decisions, Microsoft Syntex unlocks the value of the massive amount of content that resides within an organization,” McNulty told ZebethMedia in an email interview. “Virtually any industry with large scale content and processes will see benefits from adopting Microsoft Syntex. In particular, we see the greatest alignment with industries that work with a higher volume of technically dense and regulated content – financial services, manufacturing, health care, life sciences, and retail among them.” Syntex offers backup, archiving, analytics and management tools for documents as well as a viewer to add annotations and redactions to files. Containers enable developers to store content in a managed sandbox, while “scenario accelerators” provide workflows for use cases like contract management, accounts payable and so on.
“The Syntex content processor lets you build simple rules to trigger the next action, whether it’s a transaction, an alert, a workflow or just filing your content in the right libraries and folders,” McNulty explained. “[Meanwhile,] the advanced viewer adds an annotation and inking layer on top of any content viewable in Microsoft 365. Annotations can be made securely, with different permissions than the underlying content, and also without modifying the underlying content.” McNulty says that customers like TaylorMade are exploring ways to use Syntex for contract management and assembly, standardizing contracts with common clauses around financial terms. The company is also piloting the service to process orders, receipts and other transactional documents for accounts payable and finance teams, in addition to organizing and securing emails, attachments and other documents for intellectual property and patent filings. “One of the fastest-growing content transactions is e-signature,” McNulty said. “[With Syntex, you] can send electronic signature requests using Syntex, Adobe Acrobat Sign, DocuSign or any of our other e-signature partner solutions and your content stays in Microsoft 365 while it’s being reviewed and signed.” Intelligent document processing of the type Syntex does is often touted as a solution to the problem of file management and orchestration at scale. According to one source, 15% of a company’s revenue is spent creating, managing and distributing documents. Documents aren’t just costly — they’re time-wasting and error-prone. More than nine in 10 employees responding to a 2021 ABBYY survey said that they waste up to eight hours each week looking through documents to find data, and using traditional methods to create a new document takes on average three hours and incurs six errors in punctuation, spelling, omissions or printing.
A number of startups offer products to tackle this, including Hypatos, which applies deep learning to power a wide range of back-office automation with a focus on industries with heavy financial document processing needs. Flatfile automatically learns how imported data from files should be structured and cleaned, while another vendor, Klarity, aims to replace humans for tasks that require large-scale document review, including accounting order forms, purchase orders and agreements. As with many of its services announced today, Microsoft, evidently, is betting scale will work in its favor. “Syntex uses AI and automation technologies from across Microsoft, including summarization, translation and optical character recognition,” McNulty said. “Many of these services are being made available to Microsoft 365 commercial accounts with no additional upfront licensing under a new pay-as-you-go business model.” Syntex is beginning to roll out today and will continue to roll out in early 2023. Microsoft says it’ll have additional details on service pricing and packaging published on the Microsoft 365 message center and through licensing disclosure documentation in the coming months.
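The rule-to-action pattern McNulty describes — match a condition on an extracted document, then trigger filing, an alert or a workflow — can be sketched generically. This is not Syntex's API; the rule shapes, field names and action labels below are made up for illustration.

```python
# Each rule pairs a predicate over extracted document fields with the
# next action to trigger: routing, alerting, or filing the content.
RULES = [
    (lambda doc: doc["type"] == "invoice" and doc["total"] > 10_000,
     "alert-finance"),
    (lambda doc: doc["type"] == "invoice",
     "file:/accounts-payable"),
    (lambda doc: doc["type"] == "contract",
     "workflow:legal-review"),
]

def next_action(doc, default="file:/inbox"):
    """Return the action of the first matching rule, in order."""
    for predicate, action in RULES:
        if predicate(doc):
            return action
    return default

print(next_action({"type": "invoice", "total": 25_000}))  # alert-finance
print(next_action({"type": "contract", "total": 0}))      # workflow:legal-review
print(next_action({"type": "memo", "total": 0}))          # file:/inbox
```

Rule order matters here: the high-value-invoice alert must precede the generic invoice-filing rule, which is why first-match-wins engines list their most specific rules first.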

The Berlin startup that wants to give Zapier a run for its money • ZebethMedia

Zapier and IFTTT are, today, very large platforms for creating automation rules for texts or getting two apps to “talk” to each other via APIs. However, these are ‘hammers to crack nuts’ when it comes to processing simple tasks needed inside businesses. Furthermore, if you include images or video, or if the text referred to is unstructured, tools that require that structure won’t work so well, if at all. This was the thinking behind Berlin-based startup Levity. It came up with a way for businesses to create AI-powered, ‘no-code’ rules for automating tasks in a way that non-technical people can use. It’s now raised $8.3 million in seed funding, co-led by Balderton Capital (out of London) and Chalfen Ventures, with participation from a number of angel investors. Founded by Gero Keil and Thilo Hüllmann, Levity allows businesses to use simple templates to automate workflows, with, says the firm, an underlying AI which takes care of the heavy lifting. This uses NLP and computer vision in a single horizontal platform to parse unstructured data types – such as images, texts, and documents. Levity’s customers range from fashion and real estate to shipping, marketing, social media, scientific research, and others. Typical use cases include automatically tagging and routing incoming emails or email attachments; triaging customer support tickets; sorting incoming documents into respective folders; or tagging visual inventory data, such as product photos. A little like Zapier, the platform integrates with Gmail, Outlook, Google Drive, Dropbox, Airtable, and others. The startup says the system is also SOC 2 Type I certified and GDPR compliant. In a statement, Gero Keil, co-founder and CEO of Levity, said: “Businesses and their customers deserve the same opportunities to reap the benefits of AI and automation as their bigger rivals.” The platform launched this past August; subscription prices start at $200 per month.
James Wise, partner at Balderton Capital, added: “There is an increasing divide between companies with the means to capitalize on AI and automation, and those smaller businesses who lack the resources to do so. Levity is on a mission to close this divide.”

Trendsi secures $25M to help sellers and manufacturers predict demand • ZebethMedia

In the traditional business-to-business world, sellers often don’t know how much of a product they should order. Even at well-run companies, anywhere from 20% to 30% of inventory is either dead (i.e. doesn’t sell) or obsolete, according to one source. The impact on profitability can be quite severe. Dead stock costs sellers and manufacturers as much as 11% of their revenue, reports Katana, which develops raw material and bills of material tracking software. Seeking to give sellers greater visibility into product demand so they can make more informed decisions, Ella Zhang co-founded Trendsi, which connects sellers with suppliers while managing the back-end supply chain for its customer base. After gaining traction during the pandemic as many retail businesses made the risk-reducing pivot to selling goods directly to retail rather than buying inventory up front, Trendsi has closed a $25 million Series A round that brings its total capital raised to $30 million. Lightspeed Venture Partners led the tranche, with participation from Basis Set Ventures, Footwork VC, Peterson Ventures, Sierra Ventures, Liquid 2 Ventures and individual investors, including Zoom CEO Eric Yuan and Zola CEO Shan-Lyn Ma. Zhang tells ZebethMedia that the new cash will be put toward investments in data infrastructure, supply chain technology, new merchandise categories and international expansion. “We are building a new platform that lowers the barrier for anyone to start selling online or offline,” Zhang told ZebethMedia in an email interview. “With Trendsi … influencers, creators, and more can sell via social networks without worrying about sourcing products, managing warehouses, packaging and shipping, etc., so that they can focus on what they love: their brand and customers.”

Image Credits: Trendsi

Zhang came from the venture world, serving as an investment director at Kleiner Perkins after stints at Google, Tencent and Binance (where she founded the startup’s investment arm, Binance Labs).
Zhang met Trendsi’s second co-founder, Sherwin Xia, while a postgrad at Stanford, where the two participated in the Stanford Startup Garage incubator. Xia was one of the first employees at e-scooter startup Lime and previously worked as an analyst at a16z (Andreessen Horowitz). Zhang, Xia and Trendsi’s third co-founder, Maddie Davidson, sought with Trendsi to build a service that applies AI and machine learning to streamline tasks like inventory and sales forecasting. Using data collected on the platform and from third parties, Trendsi attempts to predict sales down to the SKU level, so that sellers can reduce excess inventory and ideally prevent out-of-stock issues. Beyond this, the platform taps sales and behavioral data to curate and recommend products to sellers. Recently, Trendsi launched a feature it calls “just-in-time” manufacturing, which aims to help manufacturers quickly restock based on real-time sales data and predictions. “[This] allows retailers to only take minimum and no inventory risk by building our inventory and sales forecasting models and offering the drop-shipping service,” Zhang explained. “The original upfront risk of buying inventory is now shared among retailers, Trendsi platform and the manufacturers.” Despite competition from inventory optimization startups like Flieber, Syrup Tech and Black Crow AI, business has been robust over the two years since Trendsi’s founding, Zhang claims, with new user growth up 10x year-over-year. (She declined to give a figure.) Over the next year, the company plans to expand its work with sellers and manufacturers in industries where it sees strong upward momentum, specifically home decor, accessories and makeup. “For both our suppliers and retailers, especially in fast fashion, overstock means locked-in capital, wastage of storage space, increased inventory holding costs and unnecessary losses,” Zhang said. “This pandemic has revealed the real costs associated with inventory mismanagement. 
So Trendsi actually gained traction.” San Francisco-based Trendsi currently has 105 full-time employees and expects to hire 15 more by the end of the year. Not all retailers are climbing aboard the AI train. Nearly half of respondents to a KPMG survey cited cybersecurity breaches and possible bias as their top concerns about the technology, while 75% said they believe AI is “more hype than reality.” But broadly speaking, AI in retail is a burgeoning category, with the vast majority of retailers participating in the survey saying their employees are prepared — and have the skills — for AI adoption. Retail business leaders expect AI to have the biggest impact in customer intelligence, inventory management and chatbots for customer service, creating a virtuous adoption-investment cycle in the coming years.
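SKU-level forecasting of the kind Trendsi describes can be illustrated with a deliberately naive sketch: a moving average per SKU plus a buffered restock suggestion, standing in for the far richer models and behavioral signals a production system would use. All numbers and SKU names are invented.

```python
# Weekly units sold per SKU -- toy numbers, not Trendsi data.
sales_history = {
    "DRESS-RED-S": [12, 15, 14, 18, 20, 22],
    "DRESS-RED-M": [30, 28, 25, 24, 22, 20],
}

def forecast_next_week(history, window=3):
    """Naive per-SKU forecast: the average of the last `window` weeks."""
    return sum(history[-window:]) / window

def reorder_quantity(history, on_hand, safety_factor=1.2):
    """Suggest a restock covering forecast demand plus a small buffer --
    the 'just-in-time' idea of holding minimal inventory and letting
    real-time sales data drive manufacturing."""
    need = forecast_next_week(history) * safety_factor
    return max(0, round(need - on_hand))

for sku, history in sales_history.items():
    print(sku, forecast_next_week(history), reorder_quantity(history, on_hand=10))
    # DRESS-RED-S 20.0 14
    # DRESS-RED-M 22.0 16
```

The point of the sketch is the shape of the problem, not the model: a rising-demand SKU gets a larger restock than its recent average, while falling demand caps the order, which is how forecasting reduces both stockouts and dead inventory.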
