Technology already exists to read text from signs and similar objects in video images. In fact, there is an iPhone application that not only reads the text but also translates it and overlays the translation on the original, in real time. There are even free online services that will extract the text from your camera image.
Using optical character recognition (OCR) to read signage will help determine location more accurately, and may provide additional metadata.
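As a rough illustration of how sign text could sharpen a location estimate, the sketch below fuzzy-matches a string read from a sign against a list of known place names. The function name `match_sign_text`, the place list, and the 0.8 similarity cutoff are all assumptions made for this example; a real system would obtain the text from an OCR engine and the place names from a map or gazetteer service.

```python
import difflib


def match_sign_text(ocr_text, known_places, cutoff=0.8):
    """Return the known place name closest to the OCR'd text, or None.

    ocr_text:     string read from a sign (may contain OCR mistakes)
    known_places: candidate place names near the rough location (hypothetical data)
    cutoff:       minimum similarity ratio (0..1) required to accept a match
    """
    # Compare in lower case, but remember the original spelling for the result.
    normalized = {place.lower(): place for place in known_places}
    matches = difflib.get_close_matches(
        ocr_text.strip().lower(), list(normalized), n=1, cutoff=cutoff
    )
    return normalized[matches[0]] if matches else None


# Hypothetical candidates near the camera's approximate position.
places = ["Piccadilly Circus", "Oxford Circus", "Leicester Square"]
best = match_sign_text("Picadilly Circus", places)  # tolerates the OCR misspelling
```

The cutoff is a trade-off: a lower value tolerates more OCR errors but risks matching the wrong place.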
But the technology is going beyond that. For example, Clarifai has an online application programming interface (API) that lets apps build on its technology by uploading images or video and getting back keywords that describe the content. This is valuable Content Metadata. Clarifai tends to return more keywords than are useful, but there are ways to deal with that.
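One way to deal with an oversupply of keywords is to keep only the most confident ones and drop generic terms. The sketch below is a minimal illustration of that idea, not Clarifai's actual API: it assumes the service's response has already been reduced to a list of (keyword, confidence) pairs, and the function name `filter_keywords`, the thresholds, and the sample data are all hypothetical.

```python
def filter_keywords(predictions, min_confidence=0.9, max_keywords=5,
                    stopwords=frozenset()):
    """Keep only the most confident, non-generic keywords.

    predictions:    list of (keyword, confidence) pairs, confidence in 0..1
                    (an assumed shape, not a real API response)
    min_confidence: discard keywords scored below this value
    max_keywords:   return at most this many keywords
    stopwords:      keywords considered too generic to be useful
    """
    kept = [
        (name, score)
        for name, score in predictions
        if score >= min_confidence and name.lower() not in stopwords
    ]
    # Highest confidence first, then truncate to the requested number.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in kept[:max_keywords]]


# Made-up sample data for illustration only.
sample = [
    ("street", 0.98), ("no person", 0.97), ("sign", 0.95),
    ("travel", 0.91), ("outdoors", 0.88), ("city", 0.99),
]
useful = filter_keywords(sample, stopwords={"no person"})
```

The confidence threshold and the stopword list would need tuning for a given collection; the values here are placeholders.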