Research team develops systems that process and understand spoken language, especially Basque
Mar 10, Technology
A research team drawn from the Department of Systems and Automation Engineering of the Polytechnic University School and from the Faculty of Informatics at the Donostia-San Sebastián campus of the University of the Basque Country (UPV/EHU), led by lecturer Miren Karmele Lopez de Ipiña, is developing systems that process and understand spoken language and automatically extract information, particularly from Basque radio and television.
Searching the web for written documents is easy: the word is simply entered into the search tool. Such searches, however, do not work with the spoken word or with audio archives unless these are accompanied by a written description.
Recognising spoken language and converting it into text is not easy. Words cannot be easily distinguished from each other, intonation has to be taken into account and noise in the physical signal is a further obstacle. For all these reasons, there is a huge market for systems that process and understand spoken language, i.e. systems that convert it into written text. Such systems are mainly integrated into telephone services such as appointment scheduling, product orders and bookings for performances. There are also other applications, for example automatic dictation, i.e. systems that convert speech to written text on the spot. It is on this latter aspect that the research team at the Department of Systems and Automation Engineering at the UPV/EHU is focusing.
To process speech, the system has to be well trained, i.e. it has to be taught through a training process known as machine learning. First, television or radio audio files are needed, along with reference texts from the same media. The UPV/EHU research team, for example, frequently uses files from the Gaur Egun and Teleberri news programmes (Basque Television news, in Basque and Spanish respectively) to train the system. It is not necessary to know what is being said word for word; the system has to be able to produce a summary of what is heard. Ultimately, the system seeks to learn the relation between the words and the sounds.
Once the training process is complete, the system should be capable of understanding what is heard in any edition of Gaur Egun or Teleberri. Although the learning process is very lengthy, once the system has internalised the rules or the information, i.e. suitable reference material, the result (in this case, written text from speech) is obtained rapidly.
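The split between a slow training phase and a fast recognition phase can be illustrated with a toy example. The sketch below is not the UPV/EHU team's method: it swaps in a simple nearest-centroid classifier, and the two-number "acoustic features" and the function names `train` and `recognise` are assumptions made purely for illustration. The labels are the Basque words "bai" (yes) and "ez" (no).

```python
import math
from collections import defaultdict

def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def recognise(model, features):
    """Return the label whose centroid lies closest to the feature vector."""
    return min(model, key=lambda label: math.dist(model[label], features))

# Hypothetical acoustic features paired with their reference transcriptions.
training_data = [
    ([1.0, 0.2], "bai"),
    ([0.9, 0.1], "bai"),
    ([0.1, 1.0], "ez"),
    ([0.2, 0.8], "ez"),
]

model = train(training_data)              # done once, on all reference material
print(recognise(model, [0.95, 0.15]))     # afterwards, each lookup is cheap
```

All the work of reading the reference material happens in `train`; `recognise` only compares a new input against the stored centroids, which is why the result is obtained quickly once training is over.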
Small and big
In practice, most applications of this type on the market are aimed at the “big” languages, above all English. The research team at the Polytechnic University School in Donostia-San Sebastián, together with the IXA team, GTTS and the Computational Intelligence team from the UPV/EHU, is instead working with the Basque language, Euskera. The main difference between “small” languages and “big” ones is the amount of reference data. Systems of this type for English have an impressive amount of data, while the reference material for Basque is considerably smaller. For this reason, the research team is focusing on developing new techniques to make better use of this limited data and to use it with greater precision.
To obtain greater precision, mathematical equations are used. The aim is to locate the most important characteristics of the audio files, those that provide useful information. Making this selection, distinguishing suitable data from unsuitable information, is not easy. Normally, the UPV/EHU research team takes frequency and intonation into account in order to classify all the information gathered (for example, to differentiate a question from a statement).
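As a rough illustration of how frequency and intonation can be turned into numbers, the sketch below estimates the pitch of a short audio frame by autocorrelation and then compares the pitch at the start and end of an utterance. It is a minimal sketch under stated assumptions, not the team's actual feature set: the sample rate, the synthetic test tones, the function names and the idea of treating a rising pitch as question-like are all simplifications chosen for this example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

def tone(freq_hz, duration_s):
    """Generate a pure sine tone, standing in for a stretch of voiced speech."""
    n = round(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n)]

def estimate_pitch(frame, min_hz=80, max_hz=400):
    """Estimate pitch as the lag with the strongest autocorrelation."""
    min_lag = SAMPLE_RATE // max_hz
    max_lag = SAMPLE_RATE // min_hz
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag

def intonation(samples, frame_s=0.05):
    """Compare the pitch of the first and last frames of an utterance."""
    frame_len = round(SAMPLE_RATE * frame_s)
    start = estimate_pitch(samples[:frame_len])
    end = estimate_pitch(samples[-frame_len:])
    return "rising" if end > start else "falling or flat"

# Synthetic utterances: pitch climbing from 100 Hz to 200 Hz, and the reverse.
rising_utterance = tone(100, 0.05) + tone(200, 0.05)
falling_utterance = tone(200, 0.05) + tone(100, 0.05)

print(intonation(rising_utterance))
print(intonation(falling_utterance))
```

Real speech is far noisier than these pure tones, so a practical system would need voiced/unvoiced detection and smoothing across many frames before a pitch contour could help separate questions from statements.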
These systems depend heavily on the language, and each language has its own system. The UPV/EHU team is working not only with Euskera but also with Spanish and French. In studying the Teleberri and Infozazpi programmes, amongst others, they have two goals: on the one hand, to understand Spanish and French as well as Basque and, on the other, to detect the similarities within these systems between Euskera and the other two languages, in order to further train the systems in Basque.
In this regard, the UPV/EHU research team is currently undertaking trials to develop a system that is valid for more than one language. This is precisely the challenge for the future: to develop a system that is capable of understanding Basque, Spanish and French.
Source: Elhuyar Fundazioa