Friday, May 1, 2026
Technology

‘The whale can now see’: DeepSeek adds AI vision in major move

On DeepSeek’s chat interface, a new ‘image recognition mode’ has been added alongside the ‘expert’ and ‘flash’ chat modes

Vincent Chow
Published: 10:00pm, 29 Apr 2026

Chinese artificial intelligence start-up DeepSeek has added multimodal capabilities to its flagship chatbot for the first time – meaning that it can process images and video in addition to text – bringing it in line with rivals that already offer the function.

The limited release to select users comes just days after the Hangzhou-based company released its new flagship model V4, which was followed by extensive price cuts.

According to DeepSeek multimodal team leader Chen Xiaokang, who made the announcement on Wednesday on X, the function was initially offered to select users on DeepSeek’s chatbot website and mobile application for beta testing.

On DeepSeek’s chat interface, a new “image recognition mode” had been added alongside the “expert” and “flash” chat modes, which were introduced earlier this month.

As AI continues to rapidly progress, multimodal capabilities are viewed as a necessity to move beyond simple text conversations with users into more complex and economically valuable domains.

Read original at South China Morning Post
