Machine Vision

Nov 1, 2024

Scientists are using AI on tourist photos to track Antarctic penguins

AI-powered penguin tracking in Antarctica: Scientists have developed a novel approach to monitor penguin colonies in Antarctica by leveraging artificial intelligence to analyze tourist photographs. A team led by researchers from Stony Brook University in New York has successfully used AI to transform tourist photos into a 3D digital map of Antarctic penguin colonies. The innovative technique combines AI image analysis with 3D landscape modeling to track changes in penguin populations and locations over time. This method could prove especially valuable in remote regions where traditional aerial surveys are infrequent. Technological approach and efficiency: The AI-assisted method significantly streamlines the...

Nov 1, 2024

Google’s new ‘InkSight’ tool turns handwritten notes into digital text

Bridging the analog-digital divide: Google Research's new AI system, InkSight, represents a significant breakthrough in converting handwritten notes to editable digital text, potentially transforming how millions capture and preserve their thoughts. InkSight accurately converts photographs of handwritten notes into digital text, addressing the longstanding challenge of bridging traditional handwriting with digital note-taking. The system combines sophisticated AI capabilities to read, understand, and reproduce text naturally, moving beyond previous methods that relied on analyzing geometric properties of written strokes. In human evaluations, 87% of InkSight's samples were considered valid tracings of input text, with 67% being indistinguishable from human-generated digital handwriting....

Oct 30, 2024

Waymo is using Google’s Gemini to train its robotaxis

Waymo's AI innovation in autonomous driving: Waymo, the Alphabet-owned autonomous vehicle company, is developing a new training model for its robotaxis built on Google's multimodal large language model (MLLM) Gemini, signaling a potential breakthrough in the application of AI to self-driving technology. Waymo has introduced EMMA (End-to-End Multimodal Model for Autonomous Driving), a new end-to-end training model that processes sensor data to generate future trajectories for autonomous vehicles. This development represents one of the first indications that a leader in autonomous driving is exploring the use of MLLMs in its operations, potentially expanding the application of large language models beyond...

Oct 29, 2024

New algorithm enables driverless cars to avoid hitting unseen pedestrians

AI advances in driverless car safety: A new algorithm developed by VERSES AI, a California-based cognitive computing company, aims to improve how autonomous vehicles predict and respond to hidden objects and unpredictable movements on the road. The algorithm enhances driverless cars' ability to anticipate the sudden appearance of vehicles, cyclists, and pedestrians that may be initially out of sight. This development addresses a critical challenge in autonomous driving technology: accurately predicting the behavior of road users who are temporarily obscured from the vehicle's sensors. Key innovation - occlusion reasoning: The algorithm incorporates occlusion reasoning, a technique that helps autonomous systems...

Oct 29, 2024

SimpliSafe’s new AI camera detects threats at your doorstep

AI-powered home security innovation: SimpliSafe, a Boston-based company, is introducing a new outdoor security camera that utilizes artificial intelligence to assess potential threats at the doorstep. The camera aims to detect danger before it occurs, moving beyond traditional systems that react only after a break-in has happened. SimpliSafe's CEO, Christian Cerda, envisions a future where the system could autonomously determine friend from foe, though currently it still relies on human agents for decision-making. How the system works: The new SimpliSafe camera combines AI technology with human monitoring to provide a more proactive approach to home security. The camera captures video...

Oct 29, 2024

Don’t have an iPhone? Here are 5 other tools that also let you use ‘visual AI’

Visual AI revolution in smartphones: Apple's Visual Intelligence, introduced with the iPhone 16, marks a significant advancement in smartphone capabilities, integrating machine learning and advanced image processing for real-time image analysis and information retrieval. Visual Intelligence allows users to analyze images using AI, identifying objects and text within photos and initiating contextual searches directly from the iOS 18 interface. The feature is designed to make searches and information retrieval more accessible and intuitive, streamlining everyday interactions with images. Alternatives for older devices: For users with older phones lacking access to Visual Intelligence, several free AI tools offer similar functionalities for...

Oct 26, 2024

AI weapons scanners fail to detect any guns in NYC subway test

AI-powered subway scanners fall short in New York City pilot: A recent trial of artificial intelligence-driven weapons detection technology in New York City's subway system yielded disappointing results, raising questions about the efficacy and feasibility of such security measures in mass transit. Key findings of the pilot program: The 30-day test of AI-powered scanners across 20 subway stations revealed significant limitations in the technology's ability to accurately detect firearms. The scanners performed 2,749 scans but failed to detect any firearms during the trial period. A concerning 118 false positives were recorded, resulting in a 4.29% false alarm rate. The system...
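The false alarm rate quoted above follows directly from the two counts in the excerpt. A minimal sketch of that arithmetic, assuming the rate is defined as false positives divided by total scans (the function name is illustrative, not from the pilot report):

```python
def false_alarm_rate(false_positives: int, total_scans: int) -> float:
    """Return false positives as a percentage of all scans performed."""
    if total_scans <= 0:
        raise ValueError("total_scans must be positive")
    return 100.0 * false_positives / total_scans


# Figures taken from the excerpt: 118 false positives in 2,749 scans.
rate = false_alarm_rate(118, 2749)
print(f"{rate:.2f}%")
```

Rounded to two decimal places, the result matches the 4.29% figure reported for the trial.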

Oct 23, 2024

Cohere just gave the power of vision to its RAG search offering

Cohere enhances RAG search with multimodal capabilities: Cohere has upgraded its Embed 3 model to include multimodal embeddings, allowing for image-based retrieval-augmented generation (RAG) in enterprise search. Key features of the new Embed 3 model: it generates embeddings for both images and text; uses a unified latent space for encoders, enabling mixed-modality searches; is available in over 100 languages; and is accessible on Cohere's platform and Amazon SageMaker. Expanding enterprise data accessibility: it enables businesses to search complex reports, product catalogs, and design files; increases the volume of data accessible through RAG search; and allows incorporation of charts, graphs, product images, and design templates...
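To illustrate why a unified latent space enables mixed-modality search, here is a minimal sketch in plain Python. It does not call Cohere's API; the three-dimensional vectors, file names, and the `search` helper are hypothetical stand-ins for real model output. The point is only that once text and images are embedded in the same space, a single similarity ranking covers both.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical hand-made embeddings; a real system would get these from a model.
index = [
    ("text", "Q3 revenue summary", [0.9, 0.1, 0.0]),
    ("image", "revenue_chart.png", [0.8, 0.2, 0.1]),
    ("image", "office_photo.jpg", [0.0, 0.1, 0.9]),
]


def search(query_vec, items, k=2):
    """Rank text and image items together by similarity to the query vector."""
    scored = [(cosine(query_vec, vec), kind, name) for kind, name, vec in items]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:k]


# Hypothetical embedding of a text query such as "quarterly revenue".
results = search([1.0, 0.0, 0.0], index)
for score, kind, name in results:
    print(f"{score:.3f}  {kind:5s}  {name}")
```

In this toy index the top two hits are one text document and one image, returned by the same ranking step with no modality-specific logic.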

Oct 20, 2024

UCLA’s new AI model may open the door to personalized medicine

Breakthrough in AI-powered medical imaging analysis: UCLA researchers have developed a revolutionary AI model called SLIViT that can rapidly and accurately analyze 3D medical images across various modalities, potentially transforming disease diagnosis and treatment planning. Key features and capabilities: SLIViT (SLice Integration by Vision Transformer) can analyze retinal scans, ultrasound videos, CTs, MRIs, and other imaging types. The model identifies potential disease-risk biomarkers with high accuracy across a wide range of diseases. It outperforms many existing disease-specific foundation models. SLIViT uses a novel pre-training and fine-tuning method based on large, accessible public datasets. Potential impact on healthcare: The model could...

Oct 19, 2024

Amazon boosts electric delivery vans with AI vision

AI-powered efficiency for Amazon's electric delivery fleet: Amazon's new Vision-Assisted Package Retrieval (VAPR) system aims to streamline the package retrieval process for delivery drivers, potentially revolutionizing the logistics industry. VAPR uses artificial intelligence and computer vision to identify and highlight packages for each delivery stop, projecting green "O" symbols on correct packages and red "X" symbols on others. The system provides audible cues to ensure drivers don't leave packages behind, eliminating the need for manual package organization and label checking. Amazon's Transportation team reports significant improvements in early tests, including a 67% reduction in perceived driver effort and over 30...

Oct 18, 2024

H2O.ai launches 2 vision AI models for better document analysis

AI innovation in document analysis: H2O.ai, an open-source AI platform provider, has introduced two new vision-language models that challenge larger models from tech giants in document analysis and optical character recognition (OCR) tasks. H2O.ai's new models, H2OVL-Mississippi-2B and H2OVL-Mississippi-0.8B, demonstrate competitive performance against much larger models from major tech companies. The H2OVL-Mississippi-0.8B model, with only 800 million parameters, outperformed all other models on the OCRBench Text Recognition task. The 2-billion parameter H2OVL-Mississippi-2B model showed strong general performance across various vision-language benchmarks. Efficiency and accessibility: H2O.ai's approach focuses on creating smaller, specialized models that offer high performance while...

Oct 17, 2024

Newton AI model learns physics autonomously from raw data

Breakthrough in AI-driven physics understanding: Archetype AI's Newton model represents a significant advancement in artificial intelligence's ability to comprehend and predict complex physical phenomena using only raw sensor data. Newton, developed by researchers at Archetype AI, can learn intricate physics principles without any pre-programmed knowledge or human guidance. The model demonstrates remarkable generalization capabilities across diverse physical phenomena, relying solely on raw sensor measurements as input. Trained on over half a billion data points from various sensor measurements, Newton showcases an unprecedented ability to adapt to new domains with minimal additional training. Impressive performance across diverse applications: Newton's versatility and...

Oct 14, 2024

4D LiDAR startup Lidwave secures $10M for machine vision advances

Lidwave's breakthrough in LiDAR technology: Lidwave, a startup focused on machine vision, has secured $10 million in funding to develop and commercialize its innovative 4D LiDAR-on-chip technology. Jumpspeed Ventures and Next Gear Ventures led the funding round, with additional investment from a leading Swedish truck manufacturer and other strategic partners. The funding will be used to further develop Lidwave's optical chip, launch the industry's first software-definable 4D LiDAR sensor, and expand the company's market presence. Lidwave's technology aims to revolutionize machine vision by making advanced perception technology more accessible and affordable for mass-market applications. The challenge in current LiDAR systems:...

Oct 9, 2024

Wimbledon embraces AI, ditches line judges after 147 years

AI Takes Center Court at Wimbledon: The All England Lawn Tennis Club (AELTC) has announced a significant technological shift for the 2025 Wimbledon Championships, replacing human line judges with artificial intelligence across all 18 courts. After 147 years of tradition, Wimbledon will implement electronic line calling (ELC) technology to make "out" and "fault" calls, marking the end of an era for the iconic cream berets and navy blazers of human line judges. The decision aligns Wimbledon with other major tennis tournaments, such as the US Open and Australian Open, which have already adopted similar AI-powered officiating systems. This move reflects...

Oct 9, 2024

Ring’s new AI-powered Smart Video Search impresses with stunning accuracy

AI-powered Smart Video Search revolutionizes Ring camera functionality: Amazon's Ring is introducing a groundbreaking AI feature that will significantly enhance the search capabilities of its video history, offering users unprecedented precision in locating specific events captured by their cameras. Key features and functionality: The new Smart Video Search employs advanced AI technology to allow users to search for highly specific situations within their Ring camera footage. Users can input detailed search queries like "UPS truck" or even more specific requests such as "raccoon at night in the rain" to quickly locate relevant video clips. The feature utilizes Ring IQ, a...

Oct 8, 2024

Apple’s new ‘Depth Pro’ AI model could revolutionize augmented reality

Breakthrough in 3D mapping technology: Apple has unveiled a groundbreaking AI model called Depth Pro, capable of rapidly generating accurate depth maps from single images in real-time, potentially transforming various industries and applications. Depth Pro can estimate both relative and absolute depth, producing "metric depth" data from a single image input. The AI model operates efficiently on standard home computing hardware, eliminating the need for specialized AI chips. Apple's technology demonstrates the ability to draw accurate measurements between objects within an image. Depth Pro avoids common inconsistencies, such as misidentifying the sky or confusing foreground and background elements. Potential applications...
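To show how metric depth makes measurements between objects possible, here is a minimal sketch using the standard pinhole camera model. It is not Apple's implementation; the camera intrinsics (`fx`, `fy`, `cx`, `cy`), pixel coordinates, and depth values are hypothetical. Each pixel with a depth in meters is back-projected to a 3D point, and the measurement is the Euclidean distance between two such points.

```python
import math


def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with metric depth to a 3D point in camera coordinates."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def distance_m(p, q):
    """Euclidean distance in meters between two 3D points."""
    return math.dist(p, q)


# Hypothetical intrinsics for a 1920x1080 image (focal length in pixels).
fx = fy = 1000.0
cx, cy = 960.0, 540.0

# Two hypothetical pixels, both estimated to be 2.0 m from the camera.
a = backproject(960, 540, 2.0, fx, fy, cx, cy)
b = backproject(1460, 540, 2.0, fx, fy, cx, cy)

print(f"{distance_m(a, b):.2f} m")
```

With relative depth alone the two points could only be ordered by distance from the camera; the absolute scale is what turns the pixel offset into a length in meters.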

Oct 4, 2024

Google Lens now lets you search with videos too

Google Lens expands search capabilities with video and voice features: Google has introduced new functionalities to its Lens app, allowing users to search using video and voice commands, enhancing the visual search experience. The update, rolling out in Search Labs on Android and iOS, enables users to record short videos and ask questions about what they're seeing. Google's Gemini AI model processes the video content and user queries to provide relevant responses and search results. The new feature builds upon Google's existing image search capabilities, applying computer vision techniques to analyze multiple video frames in sequence. How it works: Users...

Oct 3, 2024

AI-powered cameras now monitor drivers in the UK

New AI camera monitors driving offenses in Plymouth: The city has deployed an advanced camera system on Tavistock Road in Derriford to detect traffic violations using artificial intelligence technology. The AI-powered camera captures images of passing vehicles to identify two specific offenses: drivers using mobile phones and passengers not wearing seatbelts. Plymouth City Council emphasizes that while AI detects potential violations, all images are subsequently reviewed by a human operator for verification. Depending on the offense, drivers may receive either a warning letter or a notice of intended prosecution. Law enforcement perspective: Adrian Leisk, head of road safety for Devon...

Oct 3, 2024

NVIDIA unveils AI breakthroughs at European vision conference

NVIDIA Research showcases AI breakthroughs at ECCV: NVIDIA's research team is presenting cutting-edge innovations in computer vision and AI at the European Conference on Computer Vision (ECCV) in Milan, with a focus on automotive applications and embodied AI. Key automotive research highlights: NVIDIA's presentations at ECCV include several groundbreaking developments in automotive-related AI technologies. RealGen, a novel framework for traffic scenario generation, uses retrieval-augmented generation to synthesize new scenarios by combining behaviors from multiple retrieved examples. The NeRFect Match explores the use of Neural Radiance Field (NeRF) features for visual localization, a crucial capability for autonomous driving applications. Dolphins, a...

Oct 2, 2024

DroneDeploy launches ‘Safety AI’ to protect against construction site hazards

New AI tool aims to revolutionize construction site safety: DroneDeploy has launched Safety AI, an innovative solution designed to automatically identify and prioritize safety risks on construction sites using advanced artificial intelligence technology. Safety AI integrates with DroneDeploy's Ground platform, analyzing thousands of images captured weekly to detect visible safety hazards. The tool ranks identified risks by severity, allowing safety teams to address potential hazards more efficiently and effectively. DroneDeploy claims Safety AI achieves a 95% accuracy rate in identifying risks based on OSHA standards. Addressing a critical industry need: The construction industry faces significant safety challenges, with accident rates...

Oct 2, 2024

How Pano AI is using AI-powered cameras to predict wildfires

Revolutionary wildfire detection technology: Pano AI, a San Francisco-based startup, is transforming early fire detection with its innovative AI-powered camera system, offering a promising solution to the growing threat of wildfires. Founded in 2020, Pano AI has quickly established itself as a leader in wildfire detection technology, monitoring nearly 20 million acres worldwide and identifying almost 100,000 fires to date. The company's system combines ultra-high-definition cameras, infrared sensors, and deep-learning algorithms to detect smoke and other signs of fire across vast territories. Human analysts review potential fire detections to confirm outbreaks and eliminate false positives, ensuring rapid and accurate alerts...

Sep 30, 2024

MIT research investigates how AI models really perceive faces

Groundbreaking study explores AI pareidolia: MIT researchers have conducted an extensive study on pareidolia, the phenomenon of perceiving faces in inanimate objects, revealing significant insights into human and machine perception. Key findings and implications: The study introduces a comprehensive dataset of 5,000 human-labeled pareidolic images, uncovering surprising differences between human and AI face detection capabilities. Researchers discovered that AI models struggle to recognize pareidolic faces in the same way humans do, highlighting a gap in machine perception. Training algorithms to recognize animal faces significantly improved their ability to detect pareidolic faces, suggesting a potential evolutionary link between animal face recognition...

Sep 30, 2024

How AI-powered diagnostics are reshaping the future of radiology

AI revolutionizing radiology diagnostics: Artificial intelligence is transforming the field of radiology, enhancing the speed and accuracy of diagnoses while reducing the burden on radiologists and improving patient outcomes. Companies like Qure.ai, Arterys, DeepMind (now part of Google), and Cleerly are developing AI-powered tools for medical imaging analysis. These AI systems can process millions of medical images, including chest X-rays, CT scans, and MRIs, to detect diseases such as tuberculosis, lung cancer, and stroke. The technology is particularly valuable in resource-limited areas where access to radiologists is scarce. Qure.ai's innovative approach: Qure.ai's AI-powered tools are at the forefront of this...

Sep 29, 2024

New research to employ drones and AI to boost cyclist safety in Boston

Innovative approach to cyclist safety: Researchers from the University of Massachusetts are employing drones and artificial intelligence to study and improve bike lane safety in Somerville, Massachusetts. The study, taking place near Porter Square, uses drones to record traffic patterns and interactions between cyclists and vehicles from hundreds of feet above the streets. AI software will analyze the video footage to generate data for safety improvement recommendations, including changes to bike lanes, signage, and barriers. The research aims to identify effective safety measures and suggest potential driver training programs to enhance cyclist protection. Context of cycling safety concerns: The study...
