Machine Vision

Nov 22, 2024

ChatGPT may soon get a ‘Live Camera’ feature — here’s what we know

The imminent addition of real-time video capabilities to ChatGPT marks a significant expansion of the AI chatbot's sensory capabilities, moving beyond text to include visual understanding and analysis. Latest developments: OpenAI's ChatGPT mobile app beta code reveals plans for an upcoming 'Live Camera' feature that will integrate with its Advanced Voice Mode. Code snippets in version 1.2024.317 indicate functionality for real-time video processing and visual recognition. The feature will enable users to engage with ChatGPT while using their device's camera for live visual feedback. Integration with Advanced Voice Mode suggests a multi-modal interaction combining visual and audio inputs. Technical capabilities:...

Nov 20, 2024

Toronto researchers use AI to track light in motion

AI researchers at the University of Toronto have achieved a groundbreaking advancement in computational imaging that allows for the visualization of light's movement through space, captured at speeds one million times faster than a bullet. The breakthrough technology: University of Toronto computer scientists have developed an innovative camera setup and AI algorithm that can capture and visualize ultrafast light movements from any perspective, dubbing the technology "Flying with Photons." The system can track light as it speeds through objects like pop bottles or bounces off mirrors, creating dramatic slow-motion visualizations. Researchers developed a sophisticated AI algorithm that can generate videos...

Nov 19, 2024

MIT researchers train robotic dog to do parkour

The intersection of artificial intelligence, robotics, and simulated learning environments has reached a new milestone with MIT CSAIL's development of LucidSim, a system that trains robots using AI-generated virtual environments rather than real-world data. Breakthrough innovation: LucidSim represents a significant advancement in robot training by combining generative AI and physics simulators to create diverse, realistic virtual environments for machine learning. The system leverages large language models to generate detailed environment descriptions, which are then converted into images using generative AI technology. A sophisticated physics simulator ensures the generated environments accurately reflect real-world physical properties and constraints. This novel approach eliminates...

Nov 19, 2024

AI cameras are catching distracted drivers in the UK

The advancement of AI-powered traffic surveillance technology is revealing concerning patterns of unsafe driving behavior in Greater Manchester, where thousands of drivers have been caught violating basic safety laws in a recent trial period. Project overview: Greater Manchester has implemented a new AI-powered 'Heads Up' camera system to detect and document drivers using mobile phones and failing to wear seat belts. The system combines artificial intelligence with human review to identify traffic safety violations. Over a five-week period in September and October 2023, the cameras documented more than 3,200 infractions. The trial is part of Greater Manchester's Vision Zero Strategy,...

Nov 17, 2024

Moondream secures $4.5M to develop compact yet powerful AI models

Moondream's revolutionary approach to AI: Moondream, a startup emerging from stealth mode, has secured $4.5 million in pre-seed funding to challenge the notion that bigger is always better in AI models. The big picture: Moondream's vision-language model operates with just 1.6 billion parameters yet rivals the performance of models four times its size, potentially disrupting the AI industry's focus on large-scale models. The company's open-source model has already gained significant traction, with over 2 million downloads and 5,100 GitHub stars. Moondream's approach allows AI models to run locally on devices, from smartphones to industrial equipment, addressing concerns about cloud computing...

Nov 16, 2024

Apple’s Visual Intelligence impresses in early tests

Artificial intelligence-powered visual recognition technology takes a significant leap forward with Apple's Visual Intelligence beta release, showcasing promising capabilities in object identification and information retrieval. Initial capabilities and core functions: Apple's Visual Intelligence beta demonstrates several built-in features while leveraging external AI services for broader functionality. The system can directly summarize text from images, extract business information from Apple Maps, and identify dates and times for Calendar integration. For object recognition tasks, the beta currently relies on integration with ChatGPT and Google search capabilities. Google's visual search functionality has shown particularly impressive results in early testing. Real-world performance: Early testing...

Nov 15, 2024

AI satellites advance wildfire detection in California

The growing threat of wildfires in California has sparked innovative technological solutions that combine artificial intelligence with satellite technology to detect and combat blazes more effectively. Project Overview: The FireSat initiative plans to deploy 52 advanced satellites equipped with AI-powered imaging systems to revolutionize wildfire detection and response capabilities. The Earth Fire Alliance, a San Francisco-based nonprofit, is spearheading the project with $13 million in initial funding from Google.org and the Gordon and Betty Moore Foundation. Mountain View-based Muon Space has been contracted to construct the first four satellites in the constellation. The first satellite is scheduled for deployment via...

Nov 13, 2024

This AI shopping assistant combines voice and vision for better recommendations

The intersection of artificial intelligence and augmented reality is creating new possibilities for intuitive shopping experiences, as demonstrated by an innovative product recommendation system showcased at the IEEE Global Tech Forum. Project overview: A new AI-powered furniture shopping assistant, developed by Richa Gupta and Alexander Htet Kyaw, combines visual recognition and voice interaction to help users make better purchasing decisions. The system was presented at an MIT AI conference and earned recognition during AI build week. The technology addresses common challenges in furniture shopping, including decision fatigue and difficulty in articulating specific needs. The solution integrates augmented reality with contextual...

Nov 12, 2024

Google Street View’s powerful AI promises to revolutionize how cities manage urban forests

The rapid advancement of artificial intelligence combined with Google Street View imagery is transforming urban forestry management through the creation of detailed digital replicas of city trees. Revolutionary mapping achievement: AI technology has enabled the creation of "digital twins" for approximately 600,000 trees across North American cities, representing a significant breakthrough in urban forest management. These digital replicas capture detailed information about each tree's structure down to individual limbs and branches. The technology leverages existing Google Street View imagery to create comprehensive 3D models of urban trees. The system spans multiple cities across North America, creating one of the largest...

Nov 12, 2024

MIT researchers train robot dog to navigate new environments with AI

A groundbreaking AI system has demonstrated the ability to train robots in virtual environments with unprecedented success rates when transferring skills to real-world scenarios, potentially transforming how robots learn to navigate and interact with their surroundings. The innovation breakthrough: MIT researchers have developed LucidSim, a system that combines generative AI with physics simulators to create more realistic virtual training environments for robots. The system successfully trained a robot dog to perform parkour-like maneuvers without any real-world training data. LucidSim uses ChatGPT to generate thousands of detailed environmental descriptions, which are then converted into 3D-mapped training scenarios. The approach bridges the...

Nov 11, 2024

Researchers use AI to create 3D model from a 134-year-old photo

The emergence of sophisticated AI technology is enabling researchers to reconstruct ancient artifacts from historical photographs, offering new possibilities for archaeological preservation and cultural heritage studies. Breakthrough Technology: A research team at Ritsumeikan University has developed an innovative neural network capable of creating detailed 3D models from single 2D photographs. Led by Professor Satoshi Tanaka and Dr. Jiao Pan, the team successfully reconstructed an ancient stone relief from the Borobudur Temple in Indonesia. The relief, which depicts people in traditional attire against a backdrop of trees and architecture, was photographed in black and white 134 years ago before being covered...

Nov 11, 2024

San Antonio firm builds AI gunshot detection for schools

School safety and security concerns in the United States have prompted innovative technological solutions, with Texas-based Wytec International Inc. emerging as a key player in developing AI-powered threat detection systems for educational institutions. Technology overview: Wytec International has developed an artificial intelligence system that integrates with sensors and cameras to detect multiple types of security threats in school environments. The AI software processes data from a network of sensors to identify gunshots with claimed 90% accuracy in laboratory testing. The system can also detect other hazards including smoke, fire, and illegal substances. A proprietary base station processes information over a...

Nov 11, 2024

The Pentagon has a new autonomous AI-powered machine gun called ‘Bullfrog’

The U.S. military's latest counter-drone initiative represents a significant advancement in autonomous defense systems, combining artificial intelligence with traditional weaponry to address emerging aerial threats. The innovation: The Bullfrog system, developed by Allen Control Systems, integrates artificial intelligence and computer vision technology with a conventional 7.62-mm M240 machine gun mounted on a rotating turret. The system demonstrates remarkable precision in targeting and neutralizing small unmanned aerial vehicles (UAVs), requiring only minimal ammunition to achieve successful hits. Weighing less than 400 pounds, the Bullfrog offers superior mobility compared to existing counter-drone platforms. The current configuration maintains human oversight for firing decisions,...

Nov 11, 2024

New study investigates performance of AI language and vision models in healthcare

The growing adoption of AI language and vision models in healthcare has sparked critical research examining their reliability, biases, and potential risks in medical applications. Core research findings: Several major studies conducted by leading medical institutions have revealed important insights about the performance of large language models (LLMs) and vision-language models (VLMs) in healthcare settings. Research shows that seemingly minor changes, like switching between brand and generic drug names, can reduce model accuracy by 4% on average. Models demonstrated concerning biases when handling complex medical tasks, particularly in oncology drug interactions. Most LLMs failed to identify logical flaws in medical...

Nov 9, 2024

How Roboflow saved 74 years of developer time with Meta’s SAM model

Meta's Segment Anything Model (SAM) has transformed the landscape of image segmentation, dramatically reducing the time and effort required to create training data for AI models. This innovation has far-reaching implications across various industries and applications. Key developments in SAM technology: Meta released the first SAM model in 2023, enabling flexible interactive and automatic image segmentation. SAM 2, launched in July 2024, expanded capabilities to include real-time, promptable object segmentation for both images and videos. The open-source nature of SAM has fostered collaboration and continuous improvement, leading to significant advancements between versions. Quantifying the impact: Roboflow, a company leveraging SAM...

Nov 9, 2024

Swarovski unveils the world’s first AI-enabled binoculars

Innovative AI-powered binoculars revolutionize wildlife observation: Swarovski Optik has introduced the AX Visio, the world's first AI-enabled binoculars capable of identifying over 9,000 bird species in real-time, along with some mammals and insects. The AX Visio, co-developed with industrial designer Marc Newson, features an onboard computer, built-in camera, and computer vision software for species identification. Priced at €4,600 (approximately $5,000), these binoculars offer a unique combination of traditional optics and cutting-edge AI technology. The device uses image-recognition models and GPS data to identify wildlife, with different settings for birds, mammals, butterflies, and dragonflies. Technology and functionality: The AX Visio combines...

Nov 6, 2024

Apple’s Visual Intelligence rivals Google Lens — here’s how to use it

Apple's latest AI innovation: Visual Intelligence, a feature of iOS 18.2 developer beta, brings Google Lens-like functionality to iPhone 16 models, enhancing the way users interact with their surroundings through their device's camera. Key features and functionality: Visual Intelligence allows users to perform a variety of tasks using their iPhone's camera, including object description, price lookup, and business information retrieval. The feature utilizes ChatGPT to provide detailed descriptions of captured images and can answer follow-up questions. Visual Intelligence can extract important information like email addresses and phone numbers from images. Users can opt to use Google search instead of ChatGPT...

Nov 5, 2024

This AI model transforms sketches into playable games

AI-powered game generation breakthrough: Oasis, a new AI model developed by Decart in collaboration with Etched, has demonstrated the ability to generate playable Minecraft-like gameplay from images in real-time. Oasis is described as "the world's first real-time AI world model," processing user input and visual data to create dynamic gaming experiences. The system generates gameplay frames at approximately 20 frames per second, simulating physics, game rules, and graphics internally. Netflix is exploring generative AI for gameplay, even closing its own game studio to focus on this technology. Technical innovations behind Oasis: The AI model combines Vision Transformer technology with a...

Nov 4, 2024

Nvidia’s new AI agents can search and summarize huge quantities of visual data

AI-powered visual search and summarization: NVIDIA has introduced a new AI Blueprint that enables developers to create visual AI agents capable of analyzing vast amounts of video and image content across various industries. The NVIDIA AI Blueprint for video search and summarization combines computer vision and generative AI technologies to create customizable workflows for building visual AI agents. These agents can answer user questions, generate summaries, and provide alerts for specific scenarios based on visual data from cameras, IoT sensors, and vehicles. The technology is part of NVIDIA Metropolis, a set of developer tools for creating vision AI applications. Industry...

Nov 4, 2024

Microsoft Copilot Vision nears launch — here’s what we know right now

Microsoft unveils Copilot Vision: Microsoft is set to launch Copilot Vision, an AI-powered feature that will allow its Copilot assistant to visually analyze users' on-screen content. After a month-long trial with select users through Copilot Labs, Microsoft is preparing to roll out Copilot Vision to all users. The feature will be integrated into the Microsoft Edge browser, accessible via a screen-like icon. Copilot Vision enables the AI to observe and respond to on-screen content, including websites, documents, and both typed and handwritten text. Enhanced user experience: Copilot Vision aims to streamline user interactions by providing contextual assistance without the need...

Nov 4, 2024

UC Berkeley study uses AI to confirm rising Hollywood diversity

AI-powered study confirms Hollywood's increasing diversity: Researchers at UC Berkeley have used facial recognition technology to analyze on-screen representation in over 2,300 films, revealing a trend towards greater diversity in Hollywood since 2010. The study, published in the Proceedings of the National Academy of Sciences, examined 4,412 hours of footage from both popular and prestige films released between 1980 and 2022. Researchers found increased representation for women, Black, Hispanic/Latino, East Asian, and South Asian actors, particularly after 2010. The diversity increase is not limited to a few films with all non-white casts but is evident across individual movies as well....

Nov 4, 2024

How AI will allow museums to track visitors and offer personalized experiences

A new deep learning-based system for tracking museum visitors is transforming how cultural institutions engage with their audience and optimize their exhibitions. The technology behind the innovation: The system employs Convolutional Neural Networks (CNNs) and commercially available RGB cameras to track visitors wearing simple badges, offering a cost-effective and non-intrusive solution for behavior analysis. The technology identifies visitors' movement patterns and interactions with specific exhibits, providing valuable data on engagement levels. By leveraging machine learning, the system can integrate collected data to create personalized recommendations for visitors, addressing psychological needs for autonomy and competence. This approach allows for real-time, unobtrusive...

Nov 4, 2024

This startup is using AI to help patients decode their X-rays

AI-powered dental imaging revolutionizes patient understanding: Overjet, a Boston-based startup, is launching Iris, an innovative imaging system that transforms dental X-rays into clear, annotated pictures to improve communication between dentists and patients. Overjet's CEO, Wardah Inam, highlights that patients often opt out of recommended dental procedures due to a lack of understanding, with more than half of patients declining treatment. The Iris system aims to address this issue by utilizing Overjet's AI platform, which has been trained on millions of dental images to enhance X-rays and add visual elements that clearly highlight potential dental health problems. Advanced AI capabilities and...

Nov 1, 2024

This startup aims to help surgeons detect breast cancer with 3D imaging

Revolutionizing breast cancer surgery with AI-powered 3D visualization: SimBioSys, an Illinois-based startup, has developed TumorSight, an innovative technology that transforms routine MRI images into detailed, color-coded 3D models of breast tumors and surrounding tissue, aiming to improve surgical planning and outcomes for breast cancer patients. Key features of TumorSight: Converts black-and-white MRI images into spatially accurate, volumetric 3D visualizations. Uses distinct colors to highlight different breast structures (e.g., red for veins, blue for tumors, gray for surrounding tissue). Allows surgeons to manipulate the 3D model on a computer screen for better insights. Calculates crucial measurements such as tumor volume and...
