Sikuli, GUI, programming, MIT, Computer Science and Artificial Intelligence Lab, Rob Miller, computer vision


Programming Visually with Sikuli

For as long as there have been computers, there has been coding. And with coding comes repetition, lots of it. That has always been a basic fact of a programmer's existence, even as computers have become ever friendlier from a user's perspective.

That's where Sikuli comes in. The latest from the User Interface Design Group at MIT's Computer Science and Artificial Intelligence Laboratory, it's a programming tool that has the ability to see like a human being. Not only does it put the graphical user interface (or GUI) in the hands of programmers, but it may one day put programming in the hands of everyday computer users.

Sikuli stemmed from the research of Associate Professor Rob Miller, Ph.D. student Tsung-Hsiang Chang, and University of Maryland post-doctoral researcher Tom Yeh. It's a software agent that lets users quickly automate just about any task, so long as there's a GUI involved. Tasks are programmed through a combination of screenshots and simple commands.

The key to Sikuli's appeal is how intuitive it is, something that has rarely, if ever, been true of programming before. Sikuli users write scripts that look like function calls, except with screenshots between the parentheses instead of code. This type of interface makes the tool usable by beginners and seasoned programmers alike.
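
As a rough illustration of what such a script looks like, the sketch below uses image file names where the Sikuli editor would show the screenshots themselves. `click` and `wait` are commands Sikuli provides; the versions defined here are stand-in stubs that only record each call, so the snippet runs as ordinary Python without Sikuli. The file names and the `mute_speakers` task are hypothetical.

```python
# Stand-in stubs (assumption): in a real Sikuli script these commands are
# supplied by the tool and search the screen for the given image.
# Here they only record what was asked, so the sketch runs anywhere.
actions = []


def click(image):
    """Record a click on whatever on-screen region matches `image`."""
    actions.append(("click", image))


def wait(image, seconds):
    """Record a wait of up to `seconds` for `image` to appear."""
    actions.append(("wait", image, seconds))


def mute_speakers():
    """Hypothetical task: open the volume panel and tick 'Mute'."""
    click("speaker_icon.png")        # screenshot of the tray icon
    wait("mute_checkbox.png", 5)     # give the panel time to open
    click("mute_checkbox.png")       # screenshot of the checkbox


mute_speakers()
for step in actions:
    print(step)
```

Each call names what to act on by its appearance rather than by a window handle or coordinate, which is the property the paragraph above describes.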

At this point in its development, more advanced use of Sikuli requires some understanding of Python. But a streamlined, novice-friendly Sikuli could one day put programming into the hands of the average computer user. That would amount to a democratization of computing, with far-reaching cultural implications.

"You can look at it as an augmentation of human capability," Miller observes. "Which is pretty exciting, because we're not really getting much smarter biologically. I think we need to find ways to make ourselves smarter technologically."

Rob Miller (Massachusetts Institute of Technology)
Tsung-Hsiang Chang (Massachusetts Institute of Technology)
Tom Yeh (University of Maryland)

Agencies (that have supported the research):
National Science Foundation, Quanta Computer


