Thursday, July 25, 2013

Idea for a Class Exercise: Reverse Engineering Annoying and Embedded AI Software

I initially blamed age for a sudden increase in the number of "typos" that were actually correctly spelled words, but out of context. Then I started suspecting that stupid (or rather annoying!) word completion algorithms were responsible, and recently I've started catching them in the act. My favorite artificial intelligence (AI) gaffes are:

"wild west" ended up as "mild west" (the 'w' in "wild" changed to 'm' mid-way through the phrase though I didn't correct it until I was done ripping off text)

"dinner party" ==> "sinner party"

"cutting and pasting" ==> "cussing and pasting"

These mistakes lead to funny results -- in most cases the auto-correction on various platforms is just plain screwy, and in other cases of course it works well, but these are generally cases when I've completed the word, it's flagged as an incorrect spelling with a red underline, and there is a change. It seems like some software is just trying to be TOO proactive and TOO helpful, perhaps like some people are some of the time.

I would prefer slightly more patient AI ... AI without the emotional needs.

Having my AI students do a thought experiment, reverse engineering such software, will be a neat exercise in my AI class, I think, because while this software is really, really annoying, and just plain messed up, it (or its developers) is a lot more interesting and instructive in its neediness than I first appreciated!
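As a starting point for that thought experiment, here is a minimal sketch of one hypothesis students might test: that the software swaps in a dictionary word within a small edit distance of what was typed. This is a guess at the mechanism, not the actual algorithm on any platform; the tiny vocabulary and the function names are made up for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def candidates(typed: str, vocabulary: list[str], max_distance: int = 2):
    """Return (distance, word) pairs for nearby vocabulary words, closest first."""
    scored = [(edit_distance(typed, w), w) for w in vocabulary if w != typed]
    return sorted((d, w) for d, w in scored if d <= max_distance)


# Hypothetical vocabulary built from the gaffes above.
VOCAB = ["wild", "mild", "dinner", "sinner", "cutting", "cussing"]

for word in ("wild", "dinner", "cutting"):
    print(word, "->", candidates(word, VOCAB))
```

Two of the three gaffes are a single substitution away from the intended word, while "cutting" to "cussing" takes two, so a student could ask what else (word frequency, keyboard layout, preceding context) the software must be using to prefer the wrong word.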



Sunday, July 21, 2013

Playing with Pictures

I originally posted the following on my now-defunct WordPress blog in August 2011, and used the story in my Fall 2011 AI class as an example of problem solving that we'd eventually like computers to do.

---

During a driving tour of the Midwest in July that Pat and I made in our new Honda Fit, I was continually posting pictures on Facebook in a vacation album. Less than midway through I exhausted the 200 picture limit per album and was tempted to start new albums, one for each day or two, but a 200 picture limit is plenty I thought, and I liked the idea of using the constraint to prune out all but my favorites and pictures that were not thematically redundant; I also constrained myself to keep those already receiving a thumbs up or comment, etc.  After the trip I was invited onto Google+ by Russ and Mary Lou, and Google (Picasa) has no album limit that I can tell, so there are 750+ pictures there!  (https://picasaweb.google.com/106374191437655932029/SummerVacation2011 ) Talk about a lack of discipline.

A very cool functionality is that I can locate these pictures on Google maps, using any of the modalities – maps, satellite view, street view, or Earth.  In locations with sufficient resolution I could place the photo right on the spot I was standing when I took the picture, though in some cases there appears to be some drift from the location I placed it when I look back. There didn’t appear to be a way to specify orientation of the photo – what direction I was facing when I took it, but I am guessing someone will do that in the near future. In any case, it’s very cool.

Since I was placing the pictures a couple of weeks after the trip, there were different heuristics I used to place them – sometimes it was straightforward – a particular highway junction, or something otherwise named like a school or a cemetery or a mountain peak on Google maps. The order in which pictures were taken offered some constraints, since having located one picture narrowed the possible locations for the next, but frankly, I have a good memory for such things as events and sequences. In one case though, even with some known restricted area stemming from sequencing information, I was trying to locate a picture in the tiny town of Scribner, Nebraska, but I saw no way to identify the precise location of a picture I had taken of an old church or the like, with a steeple (attached below). I was in the satellite view, maximum resolution, struggling to see some identifying visual cues, but the steeple itself was impossible to make out from a direct overhead view, …, and then I saw the SHADOW of the steeple in the satellite image !!! Amazing!! I’m attaching that image to this note. That was just neat.

I had so much fun that I created a couple of other albums on Picasa. One of these was from my trip to Copenhagen in 2009, while at NSF (https://picasaweb.google.com/106374191437655932029/Copenhagen2009 ; http://doughfisher.blogspot.com/2013/07/copenhagen-2009.html). I had taken a redeye from Dulles in DC to Copenhagen, arriving about 7 or 8 AM the day BEFORE the conference would start in a port city some ways away. I never sleep on planes, not even redeye flights, so I was pretty trashed when I arrived, but how often do I go to Copenhagen! (OK, I’d been there in 2008 as well). So before taking the train to Helsingor, I walked around Copenhagen. Even though it had been more than two years previously, I remembered the sequence well and was able to place the pictures in the same way I had for our recent vacation, including confirming the exact location of a picture of a statue from its shadow! Statue and shadow attached.

What was even more striking in this case than our recent vacation was the affect of reliving that walk as I placed the pictures – I saw the images and remembered roughly the walking sequence, using cues from Google views to fill in a few gaps and otherwise reduce uncertainties. A good friend had died not long before that trip, and that had been on my mind during the stroll through Copenhagen, and a hint of that emotion in the form of reflection came back.

I had so much fun doing the picture placement that I’m trying to think about how to formalize the activity as a project for my artificial intelligence class this coming semester. There is also a good human-computer interface problem here – in many cases I could only approximately place the pictures (e.g., highway shots on our driving vacation) and representing the variable uncertainty associated with physical location would be desirable.
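One way to start formalizing this is a rough sketch, assuming each placement is a point plus an uncertainty radius and that the sequencing constraint is a cap on how far I could have traveled between consecutive pictures. The class, the function names, and the coordinates are all hypothetical, chosen only for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


@dataclass
class Placement:
    label: str
    lat: float       # degrees
    lon: float       # degrees
    radius_m: float  # uncertainty: true spot is assumed within this radius


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def is_feasible(prev: Placement, cand: Placement, max_travel_m: float) -> bool:
    """Could cand follow prev, given how far I could have gone in between?

    The uncertainty radii are subtracted because the true spots may lie
    anywhere inside each circle, including the edges facing each other.
    """
    center_gap = haversine_m(prev.lat, prev.lon, cand.lat, cand.lon)
    closest_gap = max(0.0, center_gap - prev.radius_m - cand.radius_m)
    return closest_gap <= max_travel_m


# Made-up coordinates: a vaguely placed highway shot, then a precisely placed church.
highway = Placement("highway shot", 0.0, 0.0, radius_m=5_000)
church = Placement("church with steeple", 0.0, 0.1, radius_m=20)
print(is_feasible(highway, church, max_travel_m=10_000))
```

A loosely placed highway shot constrains the next picture much less than a precisely placed steeple does, which is exactly the kind of variable uncertainty an interface would need to show.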

I am writing this note on a whim – I watched the last installment of Ken Burns’ National Parks, and was remembering my Boy Scout days of backpacking through places like Tuolumne Meadows in Yosemite and Mt Whitney in Sequoia National Park. I’ve got some great stories of the Colorado River trip and bears raiding our campground in Yosemite, but I think my favorite trip was Kings Canyon, which probably started outside Mammoth, but in any case, we hiked among some incredible lakes, most above timberline: Thousand Island, Emerald, Ruby, Garnet and Shadow lakes. There is nothing like hiking above timberline along a ridge when a wind hits your back -- really an amazing feeling.

My crispest memories are probably of Garnet Lake – it was amazing when I was a Boy Scout, and I returned in graduate school with friends Rogers and Pete. In any case, I looked for some pictures on the Web and found these: http://www-personal.umich.edu/~jensenl/visuals/album/2006/thousand/ . And many others of course. Scroll down – there are some very nice pictures here and I remember more than a few scenes – I could point out the little island in Garnet Lake I swam to and almost died (joking, sort of) – it was freezing! The places we cooked and washed. And I can probably place some of the pictures on the trail map.

This recent play with pictures suggests some possibilities with immersion into virtual worlds – the technology is pretty primitive now, but because it's piggybacking on memory of real experience, the affect is quite powerful.

Scribner Steeple (above)
Scribner Google Map image, with shadow! (below)


Copenhagen Statue (above)
Copenhagen Google Map image, with shadow! (below)

Monday, July 15, 2013

Reusing Other Instructors' Assignments ... not! (or ?)

I am at the Educational Advances in Artificial Intelligence symposium (EAAI-13), and we just concluded a session on educational repositories, particularly online repositories of homework assignments. Repositories of educational resources are a topic near and dear to my heart, but at least in the case of repositories of homework assignments, there appears to be no, little, or at best weak anecdotal evidence that assignments are being reused. At a minimum, don't we want repositories to be "instrumented," like my (and everyone's) YouTube channel(s), so I can see downloads, likes, dislikes, and more sophisticated measures of usage that are specific to homework assignments?

It's hard to know if a homework assignment that has been posted in an educational repository is actually used by another instructor, unless an instructor who has used it gets back to me and tells me so. There is some work in thinking about how to do this. But there is also low-hanging fruit. First, we can measure downloads, but beyond this, we as an educational community can take a small step towards a scholarly culture surrounding educational materials by designing licenses specific to this kind of content.

For example, a license for usage of educational content could require that the material can be used by others (e.g., following any of the principles of Creative Commons licenses: http://creativecommons.org/), but additionally require that the user report back on the usage to the author (typically, the copyright holder), whether the use is as is, or derivative.

I think that this would be an incredible help to evaluating the extent and manner of use of educational material, going well beyond measuring downloads, and ultimately of evaluating the utility of educational materials to the educational community.

Let's ask people about their use, through a license that requires report back (and nothing else), rather than simply depending on inference by machine methods.