This article outlines a process for automatically extracting metadata from a given podcast. The system, called "Zempod", identifies segments of speech and segments of music within a podcast, then runs a dedicated workflow to collect metadata from each kind of segment: a music segment has a unique acoustic fingerprint that can be matched against a record in a database, identifying the track and providing its metadata. Similarly, a speech segment can be passed through speech recognition to produce a transcript and keywords. Together these methods make podcasts searchable and allow them to be related to other objects on the internet through semantic markup.
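The two-workflow dispatch described above can be sketched roughly as follows. This is a minimal illustration, not Zempod's actual implementation: the segment structure, the `fingerprint` and `transcribe` functions, and the in-memory fingerprint database are all hypothetical stand-ins for the real components (an acoustic fingerprinter, an ASR engine, and a track-metadata service).

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds into the podcast
    end: float
    kind: str     # "music" or "speech"

# Hypothetical fingerprint database: fingerprint -> track metadata.
FINGERPRINT_DB = {
    "fp:0042": {"title": "Example Track", "artist": "Example Artist"},
}

def fingerprint(segment):
    # Stand-in for a real acoustic fingerprinting step.
    return "fp:0042"

def transcribe(segment):
    # Stand-in for a real speech recognition step.
    return "transcript and keywords for this segment"

def extract_metadata(segments):
    """Dispatch each segment to the workflow for its kind."""
    metadata = []
    for seg in segments:
        if seg.kind == "music":
            # Music workflow: fingerprint, then look up the track.
            track = FINGERPRINT_DB.get(fingerprint(seg))
            metadata.append({"span": (seg.start, seg.end),
                             "type": "music", "track": track})
        else:
            # Speech workflow: transcribe for search and keywords.
            metadata.append({"span": (seg.start, seg.end),
                             "type": "speech", "transcript": transcribe(seg)})
    return metadata

podcast = [Segment(0.0, 120.5, "speech"), Segment(120.5, 300.0, "music")]
result = extract_metadata(podcast)
```

The point of the structure is that segmentation happens once, and each segment type then flows through its own enrichment pipeline before the results are merged into one searchable metadata record.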
What I found particularly interesting in this article was that Celma and Raimond use semantic markup to solve a concrete problem, rather than simply trying the technology out. Web 3.0 ideas are incorporated specifically to provide a solution to the problem of not knowing what a podcast is about. Podcasts do come with some information, such as a brief title and/or description, but this returns to an earlier thought: how can we trust what's written?