The talking (and listening) Gecko: May 2009

Tagging (not parsing)..

Posted by Mohit Labels: nlp, pos tagging parser, volunteer

Part-of-speech (POS) tagging is an NLP problem in which each word of a natural-language sentence is tagged with a mark identifying its function in the grammatical structure of the sentence, e.g. NN for a singular noun or VB for a base-form verb. The first part of the project, which involves converting text to bash commands, will make use of POS tagging (I have decided to scrap the parser for now because tagging will provide enough data).

Here are the results from a POS tagger (CLAWS). However, since it is proprietary, I will not be using it (alternatives are Open NLP Tools, Stanford University POS Tagger and Language Tool).

1. Play all songs by coldplay from album viva la vida and all songs by death cab for cutie (by Prateek Maheshwari)

Play_VV0 all_DB songs_NN2 by_II coldplay_NN1 from_II album_NN1 viva_NN1 la_FU
vida_NN1 and_CC all_DB songs_NN2 by_II death_NN1 cab_NN1 for_IF cutie_NN1

2. Find an application that edits photos. (by Prateek Maheshwari)

Find_VV0 an_AT1 application_NN1 that_CST edits_VVZ photos_NN2 ._.

3. Open bits mail. (by Nunna Jaikish)

Open_JJ bits_NN2 mail_NN1 ._.

4. Find TODO.txt in Home (by Brad Taylor)

Find_VV0 TODO.txt_NP1 in_II Home_NN1

5. Open this website related to the Indian history from the browsing history. (by me).

Open_VV0 this_DD1 website_NN1 related_VVN to_II the_AT Indian_JJ history_NN1
from_II the_AT browsing_NN1 history_NN1 ._.

The tagging is not completely accurate: the "Open" in example 3 is incorrectly tagged as JJ (adjective) instead of VV0 (base-form verb).

However, a few observations can be drawn from the above examples:

1. The application that the user is trying to mention can be reasonably ascertained from the verbs.

2. The arguments for that command can be mined from the nouns or noun phrases (again reasonably).
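
The two observations above can be tried out with a few lines of Python. This is only a rough sketch, not the project's code: the function names `parse_tagged` and `verbs_and_nouns` are made up for illustration, and it assumes the tagger output is in the `word_TAG` form shown in the examples, with verb tags starting with VV and noun tags starting with NN or NP.

```python
def parse_tagged(line):
    """Split 'word_TAG' tokens into (word, tag) pairs.

    rpartition splits on the LAST underscore, so words that
    contain dots (like TODO.txt) stay intact.
    """
    pairs = []
    for token in line.split():
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


def verbs_and_nouns(pairs):
    """Collect candidate actions (verbs) and arguments (nouns).

    Assumption: VV* marks lexical verbs, NN*/NP* mark nouns.
    """
    verbs = [word for word, tag in pairs if tag.startswith("VV")]
    nouns = [word for word, tag in pairs if tag.startswith(("NN", "NP"))]
    return verbs, nouns


# Example 4 from the list above.
verbs, nouns = verbs_and_nouns(parse_tagged("Find_VV0 TODO.txt_NP1 in_II Home_NN1"))
print(verbs, nouns)

# Example 3, where "Open" was mis-tagged as JJ: no verb is found.
verbs3, nouns3 = verbs_and_nouns(parse_tagged("Open_JJ bits_NN2 mail_NN1 ._."))
print(verbs3, nouns3)
```

Note how the mis-tagged example 3 yields no verb at all, so a real system would need some fallback for sentences where the tagger gets the first word wrong.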

Wednesday, May 27, 2009 | 0 Comments

Project Name?

Posted by Mohit Labels: gpl, gsoc, open source, opensuse, title, volunteer

Well, it has started, but it's still not titled, and there has been limited progress in that regard.

A few names I had thought of were:

1. Vaani (Hindi, meaning "Voice")
2. Shrimp (pointless albeit cool)
3. Psittacula (scientific name for parrot :P )
4. Voice-do (suggested by Stephen, my mentor)

Can anyone suggest a better one?

Tuesday, May 26, 2009 | 5 Comments

Volunteers needed!

Posted by Mohit Labels: community, open source, volunteer

Hi, as I mentioned in my first post, I need some samples of how people would give commands to their desktops in a natural language, for now just English. I guess it'll be easier if I give some sort of questionnaire. It lists a few activities that one usually performs on a desktop; if you were to instruct your computer to do these activities, how would you formulate the sentence? You can take specific instances to create samples, e.g. for the first question you can prepare at least three sample commands: one for playing music by an artist ("Coldplay" for instance), one for an album and one for a genre.

It'll be a most valued contribution, as it will help improve the strategy used in analyzing natural language commands. Please email your results to mohit.verma.in@gmail.com with subject "gsoc help" or post your samples as comments here.

Questionnaire:

1. You want to play music (you may mention the media player or not), based on artist, genre, album etc.

2. You want to install/upgrade some software.

3. You want to initiate an IM conversation with a friend (you may mention the protocol or not, e.g. yahoo chat).

4. You want to locate a file/folder in your home directory.

5. You want to find an application which does a particular task (probably mentioned in its description).

6. You want to browse a website (possibly present in the history or in bookmarks).

(If you feel there are other common desktop activities, then please mention them as well in the sample submitted).

Sunday, May 24, 2009 | 1 Comments

System Design # 1

Posted by Mohit Labels: Design, nlp, open source, opensuse, parser, Shrimp, sphinx, stanford university

Here's the basic layout of the system.

The layer will accept inputs in two forms:

1. Regular text
2. Speech

The speech input will first have to be converted to text using a speech recognition system called Sphinx. Since this conversion is usually error-prone, the text will be enhanced using knowledge of the system.

After this, it can be handled in a similar way to regular text input.
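
How the recognized text gets "enhanced using knowledge of the system" is not worked out yet. One possible reading, sketched below purely as an assumption, is to snap misrecognized words to names the system already knows (artists in the music library, installed applications and so on). The vocabulary `KNOWN_NAMES`, the function `enhance` and the 0.8 cutoff are all made up for illustration; only `difflib.get_close_matches` is a real standard-library call.

```python
import difflib

# Hypothetical vocabulary; in practice this might come from the
# music library, the list of installed applications, etc.
KNOWN_NAMES = ["coldplay", "firefox", "banshee"]


def enhance(text, vocabulary=KNOWN_NAMES, cutoff=0.8):
    """Replace each word with its closest known name, if close enough.

    The cutoff is deliberately high so that ordinary words such as
    "play" are not pulled towards "coldplay".
    """
    fixed = []
    for word in text.split():
        match = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
        fixed.append(match[0] if match else word)
    return " ".join(fixed)


result = enhance("play all songs by coldpley")
print(result)
```

A word-by-word match like this cannot repair names that the recognizer splits into several words, so it is at best a starting point.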

In the first phase, a parser will generate a tree and tags for a given user command. For this, a statistical parser written in Java at the Stanford University Natural Language Group will be used; it can be checked here.

After this, the analyzer will try to determine the kind of action the user wants to perform, and then the application-specific interpreter will try to find the arguments in the natural language text. For example, if a user wants to play some music, the title, artist, genre etc. will probably be mentioned in the text and will have to be mined.

Often the system will not be completely sure of the result it generates, so the user's input will be taken to improve the accuracy.
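
To make the analyzer and interpreter steps a bit more concrete, here is a minimal sketch under several assumptions: the earlier phase has already produced a list of verbs and a list of nouns, the verb selects the interpreter, and the interpreter turns the nouns into a bash command. The names `interpret_find`, `INTERPRETERS` and `to_command` are hypothetical, and the rule "first noun is the file name" is a simplification. When no interpreter can handle the input, the function reports that it is unsure, which is the point where the user would be asked.

```python
import shlex


def interpret_find(nouns):
    """Hypothetical interpreter for 'find' commands.

    Simplifying assumption: the first noun is the file name and
    the search always starts from the home directory.
    """
    if not nouns:
        return None
    return "find ~ -name " + shlex.quote(nouns[0])


# Hypothetical table mapping an action verb to its interpreter.
INTERPRETERS = {"find": interpret_find}


def to_command(verbs, nouns):
    """Return (command, sure). sure is False when the user must be asked."""
    for verb in verbs:
        handler = INTERPRETERS.get(verb.lower())
        if handler is not None:
            command = handler(nouns)
            if command is not None:
                return command, True
    return None, False


command, sure = to_command(["Find"], ["TODO.txt", "Home"])
print(command, sure)

# No verb recognized: the system is unsure and should ask the user.
print(to_command([], ["bits", "mail"]))
```

The command string is only built here, not executed; quoting the argument with `shlex.quote` keeps file names with spaces or special characters from breaking the command.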

Sunday, May 24, 2009 | 1 Comments

The summer begins (officially)

Posted by Mohit Labels: community, nlp, open source, opensuse, sphinx

Hi all.. Hmm well, through this blog I hope to communicate the progress of my GSoC project, still untitled (you could help there), to everyone interested. The idea is to make a functional "Natural Language + Voice User Interface for openSUSE Desktop" (the abstract can be viewed here).

So here I am with a replaced motherboard and upgraded RAM in my laptop, all gung-ho to start my first dream project.. And at this very moment, I need some help :P

The project is about making a software layer which lets a computer understand a user's commands in a natural language (through text or speech), which is why I'd like some people to give me a sample of their natural language commands. To give a clue, if I could give commands to my computer in English, I'd say things like this --

1. "Privately message Nihar Joshi on Google Talk"
2. "Enqueue all the Coldplay songs in the player"

I do have a basic strategy in place to understand such commands: the first step is a parser, after which an application-specific analyzer will convert such a command to a bash command (which can be executed directly on a Linux platform). However, since commands in a natural language like English can vary very widely and my strategy might suffer from a lack of perspective, I'd like to take inputs from several sources to test my strategy.

Imagine that you do have such a system working on your computer. What kind of commands would you like to give to your computer in English? Please don't restrict yourself to my examples and try to think of all routine desktop activities you perform on your computer. Should you prefer anonymity you can mail me your list at mohit.verma.in@gmail.com with subject "GSoC help".

Cheers and looking forward to your support, (support open source, that's what good guys do :P )

Saturday, May 23, 2009 | 1 Comments
