Meta’s new Llama AI model can now see and run on your device
Llama 3.2 Introduces Multimodal and On-Device Models: Meta's latest update to its Llama language model series brings significant advancements in AI capabilities, including vision processing and compact on-device models. Key Features and Enhancements: The Llama 3.2 release incorporates new multimodal vision models and smaller language models optimized for on-device applications, expanding the versatility and accessibility of AI technologies. Two sizes of vision models (11B and 90B parameters) are now available, each with base and instruction-tuned variants, enabling the processing of both text and images in tandem. New 1B and 3B parameter text-only models have been introduced, designed specifically for on-device...
Sep 25, 2024
Google Earth unveils time travel feature as Maps gets major overhaul
Google Earth and Maps receive major updates: Google has announced significant enhancements to Google Earth and Google Maps, aimed at improving user experience and image quality. Historical imagery comes to Google Earth: Users can now explore Google's extensive satellite and aerial imagery library, dating back up to 80 years. The feature showcases changes over time in cities like London, Berlin, Warsaw, and Paris, with some imagery dating back to the 1930s. Google Earth's home screen has been redesigned to facilitate easier collaboration for researchers and organizations working on projects. Street View expansion: Google Maps is launching one of its biggest...
Sep 24, 2024
This New AI Mobile App Can Identify 33,000 Species of Plants in Just 3 Seconds
AI-powered plant identification app revolutionizes gardening: Plantum, a new mobile application, uses artificial intelligence to identify over 33,000 plant species with 98% accuracy in just three seconds, offering a comprehensive solution for plant care and identification. Plantum transforms iPhones and iPads into portable plant experts, providing users with instant access to botanical knowledge and care instructions. The app is currently available at a discounted lifetime subscription price of $14.97, down from its regular price of $59, making it an affordable option for both novice and experienced gardeners. Key features and functionality: Plantum offers a range of tools designed to assist...
Sep 24, 2024
Hitachi Adopts NVIDIA Tech to Power Railway Operations with AI
Hitachi Rail leverages NVIDIA AI for railway innovation: The global transportation company is integrating NVIDIA's IGX platform to enhance railway operations, reduce maintenance costs, and improve energy efficiency. Transforming railway operations with real-time analysis: Hitachi Rail's new HMAX platform, powered by NVIDIA IGX, processes sensor and camera data in real-time, enabling swift detection of maintenance needs and infrastructure issues. The HMAX platform allows for immediate identification of train tracks requiring repair, monitoring of overhead power line degradation, and assessment of train and signaling equipment health. Real-time analysis aims to further reduce service delays, decrease train maintenance costs, and cut energy...
Sep 21, 2024
Leading Medical Centers Tap AI for Tumor Detection Project
Advancing cancer detection with AI and federated learning: A committee of experts from leading U.S. medical centers and research institutes is leveraging NVIDIA-powered federated learning to enhance AI models for tumor segmentation. The project aims to evaluate the impact of federated learning and AI-assisted annotation on training AI models for more accurate cancer detection. Federated learning allows organizations to collaborate on AI model development without compromising data security or privacy, as sensitive data remains on local servers. The technique is particularly valuable in medical imaging, where privacy constraints and rapid AI development make traditional data-sharing methods increasingly challenging. Key participants...
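To make the idea of training without sharing raw data concrete, here is a minimal sketch of weighted federated averaging, a common aggregation scheme for federated learning. The summary does not say which aggregation method the project uses, so the scheme, the function name `federated_average`, and the sample numbers are all illustrative assumptions.

```python
def federated_average(site_updates):
    """Combine model weights from several sites without seeing their data.

    site_updates: list of (weights, num_samples) pairs, where weights is a
    list of floats trained locally at one site (hypothetical values).
    """
    total_samples = sum(n for _, n in site_updates)
    if total_samples == 0:
        raise ValueError("at least one site must contribute samples")

    dim = len(site_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in site_updates:
        if len(weights) != dim:
            raise ValueError("all sites must share the same model shape")
        # Each site's contribution is proportional to its local dataset size.
        for i, w in enumerate(weights):
            averaged[i] += w * n / total_samples
    return averaged


# Two hypothetical hospitals: only weights and sample counts leave each site.
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300)]
global_weights = federated_average(updates)
print(global_weights)
```

Only the locally trained weights and the sample counts are sent to the aggregator; the images themselves stay on each site's servers.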
Sep 20, 2024
New Study Shows AI Can Predict Lung Cancer from Digitized Tissue Samples
Artificial intelligence advances lung cancer prediction: A new study published in Cell Reports Medicine demonstrates how AI can accurately predict lung cancer from digitized patient tissue samples, showcasing a promising application of machine learning in medical diagnostics. Key findings and implications: Researchers from the University of Cologne developed an AI-based computational pathology platform capable of analyzing hematoxylin and eosin (H&E)-stained tissue sections for non-small cell lung cancer (NSCLC). The AI algorithm outperformed previous studies in constructing precise segmentation maps, achieving a Dice score of 88.5% for epithelial-only tumor segmentation. This study marks the first AI-based algorithm for necrosis density quantification...
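For readers unfamiliar with the Dice score quoted above, the following is a minimal sketch of how the metric is typically computed for two binary segmentation masks: twice the overlap divided by the combined size of both masks. The masks below are made-up toy data and do not come from the study.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Pixels marked as tumor in both the prediction and the reference.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 1 = tumor pixel, 0 = background (hypothetical masks).
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
print(f"Dice: {dice_score(predicted, reference):.3f}")
```

A score of 1.0 means the predicted and reference masks overlap perfectly, and 0.0 means they share no pixels.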
Sep 19, 2024
New EV Charging Station Uses AI and Face ID to Eliminate User Interaction
Revolutionary EV charging technology debuts in NYC: Revel, a Brooklyn-based charging network, has integrated Juice's AI-powered automatic charging and payment system into its DC fast chargers, marking a significant advancement in EV charging convenience. The technology behind the innovation: Juice's system utilizes computer vision, AI, and machine learning to identify vehicles, initiate charging sessions, and process payments without user interaction with physical devices or apps. Unlike existing technologies such as Plug & Charge or Autocharge, Juice's solution doesn't require manufacturer-specific integrations, making it compatible with all EV models. The system can recognize individual vehicles in a manner similar to facial...
Sep 10, 2024
New AI Framework ChartEye Will Extract Info From Any Chart
Innovative framework for automated chart analysis: ChartEye, a new deep learning framework, offers a comprehensive solution for extracting information from charts and infographics, addressing the complex challenges in automated chart understanding. Developed by researchers Osama Mustafa, Muhammad Khizer Ali, Momina Moetesum, and Imran Siddiqi, ChartEye tackles multiple tasks in the chart information extraction process. The framework utilizes advanced machine learning techniques, including hierarchical vision transformers and YOLOv7, to perform chart-type classification, text-role classification, and text detection. To improve optical character recognition (OCR) accuracy, ChartEye employs Super Resolution Generative Adversarial Networks (SR-GANs) to enhance detected text. Key performance metrics: Experimental results...
Sep 9, 2024
Luxury Brands are Using AI and Computer Vision to Fight Counterfeits
The rise of AI in luxury brand authentication: Computer Vision (CV), a branch of artificial intelligence, is emerging as a powerful tool in the fight against counterfeit luxury goods, particularly in the booming sneaker and watch markets. The global luxury goods market is projected to reach $385 billion by 2025, according to Bain & Company, highlighting the critical need for advanced authentication methods. The sneaker resale market is expected to grow to $30 billion by 2030, as reported by Cowen Equity Research, making it a prime target for counterfeiters. The counterfeit shoe market has seen an astounding 1200% increase from...
Sep 7, 2024
Capx AI Launches 8B-Parameter Multimodal Vision Model
Groundbreaking multimodal AI model unveiled: Capx AI has released Llama-3.1-vision, an 8 billion parameter Vision model that combines Meta AI's Llama 3.1 8B language model with the SigLIP vision encoder. The model, released under the Apache 2.0 License, is designed to excel in instruction-following tasks and create rich visual representations. Built upon BAAI's Bunny repository, the architecture consists of a vision encoder, a connector module, and a language model. The model leverages Low-Rank Adaptation (LoRA) for efficient training on limited computational resources. Innovative two-stage training approach: The development process involved a pretraining stage to align visual and text embeddings, followed...
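As a rough illustration of why LoRA suits limited hardware, the sketch below compares the trainable parameters of one full weight matrix with those of a rank-r LoRA update, which trains two small matrices (B of size d_out × r and A of size r × d_in) while the original weights stay frozen. The layer dimensions and rank are illustrative assumptions, not figures reported for this model.

```python
def full_params(d_out, d_in):
    """Trainable parameters when fine-tuning the whole weight matrix."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for a LoRA update: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Under these assumed sizes, the LoRA update trains well under one percent of the parameters in the layer.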
Sep 5, 2024
Select Users Can Now Use Natural Language to Search Google Photos
Google's AI-powered photo exploration: Google is rolling out a new "Ask Photos" feature to select Google Labs users in the US, leveraging its Gemini AI models to enable natural language interactions with users' photo libraries. The feature allows users to ask questions about their photos, such as inquiring about specific events or locations captured in images. Users can also request task completion, like summarizing vacation activities or selecting the best family pictures for a shared album. Google is offering a waitlist for those interested in accessing the Ask Photos feature. Enhanced search capabilities: Alongside the new AI assistant, Google Photos...
Sep 5, 2024
Zillow’s AI Revolutionizes Home Search With Natural Language Queries
Zillow's AI revolutionizes home search: Zillow has upgraded its AI search tool to simplify the process of finding a dream home by allowing users to use natural language queries instead of traditional filtering methods. The new AI-powered search feature is available on Zillow's mobile app and aims to make the home search process more intuitive and user-friendly. Users can now input queries like "homes 30 minutes from Millennium Park" or "3-bedroom houses near Roosevelt High School," eliminating the need for manual filtering of locations and other elements. The AI tool is designed to understand and process casual language, making it...
Aug 30, 2024
Fetch.ai Launches $10M AI Agent Innovation Hub in San Francisco
Fetch.ai launches innovation hub to accelerate AI agent development: The company has established a new lab in San Francisco to foster the creation of AI agent solutions for businesses and consumers, backed by a $10 million annual funding commitment. Key features of the innovation hub: Fetch.ai's new initiative aims to provide comprehensive support for startups and developers working on AI agent technology. The lab will offer funding, expert mentorship, and strategic support to help founders navigate the complexities of AI development. Competitions and fast-track development opportunities will be available to accelerate innovation in the field. The hub focuses on supporting...
Aug 29, 2024
Google AI Simulates Doom in Real-Time Without Original Code
AI-powered game simulation: Google researchers have developed an AI model called GameNGen that can simulate the classic PC game Doom in real-time without using the game's original code. The model uses AI image generation technology to create game visuals at over 20 frames per second, offering a playable experience. Researchers from Google and Tel Aviv University collaborated on this project, demonstrating that neural networks can run complex games in real-time with high quality. How GameNGen works: The AI model leverages Stable Diffusion version 1.4, an open-source image generator, to create game visuals based on player inputs and game state updates....
Aug 25, 2024
AI Decodes 3,000-Year-Old Scrolls, Unveiling Ancient Wisdom
Ancient texts brought to life: Advanced AI technology is enabling researchers to decipher previously unreadable, roughly 2,000-year-old papyrus scrolls from Herculaneum, a Roman town destroyed by the eruption of Mount Vesuvius in AD 79. Historical context and archaeological significance: The Villa of the Papyri in Herculaneum has yielded a treasure trove of ancient knowledge, preserved in an unexpected form. In 1752, archaeologists discovered 1,785 papyrus scrolls in a residential complex near Pompeii, now known as the Villa of the Papyri. Herculaneum, a coastal retreat for elite Romans, was better preserved than Pompeii due to its location and the nature of the volcanic eruption. The scrolls represent the...
Aug 19, 2024
SK Telecom Unveils Edge AI-Powered Autonomous Robots for Indoor Use
Breakthrough in autonomous robotics: SK Telecom has successfully demonstrated advanced autonomous driving robot technology powered by its innovative Telco Edge AI infrastructure, marking a significant step forward in indoor robotics and edge computing applications. Test details and technology showcase: The two-month trial, conducted at SK Telecom's Pangyo building near Seoul, focused on indoor product transport and delivery robots that require high-precision positioning in complex environments. The autonomous robots utilized a combination of sensors, including cameras and Inertial Measurement Units (IMUs), to navigate the intricate interior spaces. SK Telecom's proprietary Visual Localization And Mapping (VLAM) technology, an image-based sensor fusion positioning...
Aug 15, 2024
The Premier League is Using AI and iPhones to Make Offsides Decisions
The Premier League is set to revolutionize its offside detection technology, moving away from the current VAR system to a more advanced and precise solution developed by Genius Sports. A technological leap in football officiating: The Premier League is preparing to implement a new offside detection system called "Semi-Automated Offside Technology" (SAOT), marking a significant advancement in the use of technology in football. Developed by Genius Sports, SAOT aims to provide more accurate offside decisions compared to the current VAR system. The new technology is expected to be rolled out before the end of the year and will continue to...
Aug 15, 2024
AI Deciphers Ancient Cuneiform Texts With 97% Accuracy
Revolutionizing ancient text analysis: Natural language processing techniques are being applied to automate the transliteration and segmentation of Akkadian cuneiform texts, potentially transforming the field of Assyriology. Researchers have developed a new method using machine learning models, particularly recurrent neural networks, to transliterate and segment cuneiform characters into words with up to 97% accuracy. This innovative approach significantly accelerates the process of creating digitized editions of cuneiform texts, a task that has traditionally been time-consuming and labor-intensive. The research team trained their models on a corpus of Neo-Assyrian royal inscriptions, demonstrating the potential for broad application across different periods and...
Aug 14, 2024
Advanced AI Analyzes Facial Cues to Predict Health Issues
AI-powered visual early warning systems are emerging as a powerful tool for detecting subtle signs of health deterioration, offering potential to transform patient monitoring and early intervention across healthcare settings. Breakthrough in health monitoring: AI-based visual early warning systems can detect early signs of health deterioration with remarkable 99.89% accuracy by analyzing facial expressions and subtle cues. This cutting-edge technology leverages advanced machine learning techniques, including Convolutional Neural Networks and Long Short-Term Memory models, to analyze both spatial and temporal features in facial expressions. The system's ability to continuously monitor patients non-invasively opens up new possibilities for proactive healthcare management...
Aug 10, 2024
AI Breakthrough Turns Scanned Docs into Machine-Readable Text
Innovative OCR enhancement through AI: The LLM-Aided OCR Project represents a significant advancement in Optical Character Recognition technology by integrating large language models to improve accuracy and readability of digitized text. The project combines traditional OCR techniques with state-of-the-art natural language processing to transform raw scanned text into high-quality, well-formatted documents. Key features include PDF to image conversion, Tesseract OCR integration, and advanced error correction using both local and cloud-based LLMs. The system offers flexible configuration options, including markdown formatting and the ability to suppress headers and page numbers. Technical architecture and processing pipeline: The LLM-Aided OCR Project employs a...
Aug 8, 2024
AI Model Generates Cognitive Maps From Visual Data Alone
Cognitive maps, a cornerstone of spatial navigation and memory, have long fascinated researchers in neuroscience and artificial intelligence. A groundbreaking study published in Nature Machine Intelligence demonstrates how a self-attention neural network can generate environmental maps from visual inputs alone, potentially shedding light on both biological and artificial spatial cognition processes. Revolutionary approach to spatial mapping: The study introduces a computational model that constructs cognitive map-like representations solely from visual inputs, without relying on explicit spatial information. This breakthrough addresses a significant challenge in both neuroscience and AI: the ability to create accurate spatial maps from sensory inputs. The model's...