NTT’s Tsuzumi AI Learns to Read Images; Expected to Be Used for Reading Foreign Maps, Menus

Yomiuri Shimbun file photo
NTT Corp.’s head office in Chiyoda Ward, Tokyo

SAN FRANCISCO — NTT Corp. has created an advanced image-reading technology to improve the functionality of its tsuzumi generative AI, the company has said. Tsuzumi is set to be fully commercially available by the end of fiscal 2024.

Generative AIs have had trouble understanding visual information such as charts and diagrams. NTT’s new technology will be able to read articles, websites and contracts that contain charts or illustrations, and produce answers about such images or summarize them.

NTT expects the technology to be used to summarize foreign maps and restaurant menus captured by smartphone, and said that its accuracy is comparable to that of Google LLC’s latest generative AI platform, Gemini. NTT also plans to develop voice and other recognition technologies.

In March, NTT started offering tsuzumi, which has a high level of Japanese language ability, to companies and local governments in Japan. NTT said it has so far received more than 500 inquiries about the AI.