Wednesday, October 13, 2010

Selection of audio file with component identification

In the current version of the mGIS interface, the audio file that plays for a layer of components is defined inside the program. After the last discussion with our Geography partners, it seems that all of the students should have an identical view of a single map's data, so individualization of the mGIS interface may not be an issue. But if we want educators to be able to select audio files for different layers, we should provide some facility for doing this.

Using a style sheet may be an option. I think of it as a table-lookup process. Here is how we could incorporate a style sheet:

We already have a mouse-motion event listener registered with the map components. Whenever the mouse pointer moves over a pixel, this listener fires and calls the event-handling method.

public void mouseMoved(MouseEvent event) {
    String layerName = identifyMapLayer(event.getX(), event.getY());
    String audioFileName = style.audioFileFor(layerName);
    // ... play or queue audioFileName for this layer ...
}

Here identifyMapLayer returns the layer name, and the style sheet then looks up the audio file name for the corresponding layer. We know how to define the identifyMapLayer method. Now we need to find a way to define a style sheet.
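One very lightweight way to define such a style sheet is a Java properties file mapping layer names to audio files. This is only a sketch of that idea; the class name, file name, and example keys below are illustrative, not part of mGIS.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

// A layer -> audio-file style sheet backed by a properties file, e.g.
// a "soundstyles.properties" placed next to the map containing lines like:
//   Rivers=riverFlow.wav
//   Roads=traffic.wav
public class SoundStyleSheet {
    private final Properties table = new Properties();

    public SoundStyleSheet(String path) throws IOException {
        try (FileInputStream in = new FileInputStream(path)) {
            table.load(in);
        }
    }

    // Look up the audio file for a layer; null if the layer is unstyled.
    public String audioFileFor(String layerName) {
        return table.getProperty(layerName);
    }
}
```

With this, educators could change a layer's sound by editing one line of a text file, with no code changes.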

In this way the style sheet will have no dependency on ArcGIS components and events. From a modular point of view, the style sheet can be built independently, without much integration cost. I hope the educators using the mGIS interface will then be able to select or change the audio files for different layers.

Sunday, October 10, 2010

Style sheet API thoughts

    I'm looking through Keith Albin's style sheet project (part of his B.S. thesis project) and starting to think about how to turn it into a useful library for the current project and future projects.

    In the version Keith wrote, the interface to a soundscape map is through the map data structure.  That structure, called a "geoshape", was conceptually similar to a shape in an ArcGIS shapefile (because ultimately that's where the data came from, though it went through a couple of transformations on its way into the program), but it was a new Java data structure.  The style sheet processor worked as follows:
  •  Parse the style sheet(s).  There could be more than one because it provided "cascading" like CSS, but  somewhat more powerful (e.g., it provided variables with some simple string manipulation, to make style sheets much more compact and modular).
  •  For each geoshape:
          • look up the attributes for the geoshape by class and identifier
          • attach the attributes to the geoshape data structure

   I'm pretty sure that's not the interface we want, because we don't want the GIS software to have to put all the map data into a custom data structure.  I think it will be pretty straightforward to instead allow the GIS software to look up attributes of a shape as needed.   So now I'm thinking about what that API should be.  I'd like to keep it as independent of the particular GIS data structures as possible, but still convenient to use.

   My initial thought is that it might work something like this:
  • Instantiate a style sheet object, giving it a name or search path for a style sheet specification file.  (I.e., the constructor  of a style sheet object will require information on where to find a style sheet).
  •  Further interactions will take place from within event listeners, or code called by event listeners.  If we were drawing the map ourselves, they might also take place from methods that paint the map on the screen.  A typical call might be  something like


  AttributeDescriptor attrs = styles.getAttribute(
            eventKind,       // Should this be a string?
            shapeLayerName,  // Ditto ... string?  Probably fast enough
            shapeID          // String?
  );


     The eventKind parameter might be redundant if the AttributeDescriptor is itself a table for looking up particular  attributes, e.g., a simple string -> string table like this.

{ (color: #44c819), (on-entry-sound:  mp3#carsCrashing.mp3),
  (on-exit-sound: midi(some-textual-rep-of-midi-spec)) }


My current thinking, though, is that there isn't much advantage in retrieving all the attributes for a particular object, and then picking out the particular attributes of interest.  It seems that in almost all cases (except for painting the  map on the screen) we need only one attribute from an object, such as "what sound do I play when the cursor  enters the object called 'Yellowstone Lake' in the 'Bodies of Water' layer?"

So what is the type of the thing returned by the style sheet object in answer to that question?  Consider that some types    make sense (e.g., mp3 file and midi spec might make sense for a sound) and others don't (hexadecimal codes for  colors don't make sense for a sound).  A simple answer would be to always return a string, and let the mGIS software be responsible for making sense of the string.  However, besides making the mGIS software a bit more complex, I think there might be some performance issues with that.  For example, if the mGIS software doesn't learn the name of an   mp3 file to play until the cursor reaches an object requiring that sound clip, it cannot preload the mp3 for immediate playback.  This makes me think that there should probably be a somewhat more complex instantiation of the  style-sheet object with some "hooks" to allow much more specific attribute objects to be returned when needed.

Although I haven't thought it through completely, I'm thinking of something like a framework, where the stylesheet software (or a layer just above the core stylesheet software) allows plugging in objects with custom methods for different kinds of interaction.  Think of them as being like listeners --- the mGIS software would provide a set of listener objects when instantiating the style sheet package, and the style sheet package would invoke those listeners when the mGIS software reported an event on a particular object.   So instead of styles.getAttribute above, it might be something like

styles.reactToEvent( eventKind, shapeLayerName, shapeID )

which would find the correct listener in the style sheet internal structure and call it.
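A minimal sketch of that listener-style design might look like the following. Everything here (StyleListener, registerListener, the key scheme) is a placeholder I'm inventing for illustration, not an existing API; a real implementation would also consult the parsed style sheet for per-shape overrides.

```java
import java.util.HashMap;
import java.util.Map;

// Callback the mGIS software supplies when instantiating the style sheet.
interface StyleListener {
    void handle(String eventKind, String shapeLayerName, String shapeID);
}

class StyleSheet {
    // Listeners keyed by "eventKind/layerName".
    private final Map<String, StyleListener> listeners = new HashMap<>();

    void registerListener(String eventKind, String layerName, StyleListener l) {
        listeners.put(eventKind + "/" + layerName, l);
    }

    // Called by the mGIS event handlers; finds the matching listener
    // in the style sheet's internal structure and invokes it.
    void reactToEvent(String eventKind, String shapeLayerName, String shapeID) {
        StyleListener l = listeners.get(eventKind + "/" + shapeLayerName);
        if (l != null) {
            l.handle(eventKind, shapeLayerName, shapeID);
        }
    }
}
```

So mGIS might register an entry-sound listener for the "Bodies of Water" layer once at startup, and the mouse-motion handler would just call styles.reactToEvent("enter", "Bodies of Water", "Yellowstone Lake"). The listener, not mGIS, would then know which preloaded sound to play.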

Does this make sense? Thoughts on how to make it better or easier to use?

Tuesday, August 31, 2010

Introducing ArcGIS Menu-bar

So far the commands of the mGIS control panel were not accessible to a blind user. Only standard Java buttons were made accessible, by adding mouse-motion listeners to them. But besides the standard Java buttons, there were a couple of ArcGIS built-in commands that can't be customized to add a mouse listener.

I was searching the web and found an article that says: "JAWS is the only popular screen reader that works with Java applications....Allow users to press the “F10” key to move focus to the Java application window menu bar (“File,” “Edit,” “View”) at the top of the application window. When a menu bar menu is open, allow users to press the right arrow and left arrow keys to move between and open adjacent menu bar menus."

http://www.lawrence-najjar.com/papers/Accessible_Java_application_user_interface_design_guidelines.html

That makes me more interested in JAWS and the menu bars of ArcGIS. Now I am implementing a simple program that uses the ArcGIS menu bar for ArcGIS built-in commands and a standard Java MenuBar. If this program works well, I will integrate it with our current version of mGIS.

Fortunately, JAWS can now enunciate the names of ArcGIS commands whenever the mouse hovers over the menu items. I am also playing a bit with the Java Swing BorderLayout to implement corner-based popup menus, as suggested by our Geography partners.
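For the standard Java side, a plain Swing JMenuBar already gives us the keyboard model the article describes: F10 moves focus to the menu bar and the arrow keys move between menus and items, which is what JAWS expects. Here's a sketch; the menu and command names are made up for illustration, not the actual mGIS commands.

```java
import javax.swing.JMenu;
import javax.swing.JMenuBar;
import javax.swing.JMenuItem;

// Sketch of a standard Swing menu bar for the mGIS control panel.
public class MGISMenuBar {
    public static JMenuBar build() {
        JMenuBar bar = new JMenuBar();

        JMenu file = new JMenu("File");
        file.setMnemonic('F');                 // Alt+F also opens it
        file.add(new JMenuItem("Open Map..."));
        file.add(new JMenuItem("Exit"));

        JMenu commands = new JMenu("Commands");
        commands.setMnemonic('C');
        JMenuItem identify = new JMenuItem("Identify");
        // A screen reader announces the accessible name when the item
        // gains focus via arrow-key navigation.
        identify.getAccessibleContext().setAccessibleName("Identify map element");
        commands.add(identify);

        bar.add(file);
        bar.add(commands);
        return bar;
    }
}
```

The interesting question remains whether the ArcGIS built-in commands can be wrapped as JMenuItem actions so they inherit this keyboard behavior.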

Wednesday, August 25, 2010

Map description with the startup of soundscape feature

To give the user an overview of the map, we would like to introduce a map-description feature. This feature requires a "Read_me.txt" file stored in the same directory as the map (.mxd file). The "Read me" file contains a couple of sentences giving an overview of the map. Whenever the "identify" button is clicked, before identifying the map's elements the program starts reading out the contents of the "Read me" file. This should be a good introduction to the soundscape feature for an audience.

I ran into some issues implementing this feature. If there is more than one line (which is likely for an overview), the program reads out only the last line. This is because of the time difference between instruction loading and word pronunciation: enunciating a line obviously takes longer than loading it. So I intentionally set a delay when loading the "Read me" file's contents: after loading each line, the program now waits 5000 ms, which gives the engine enough time to read out the line. There is a trade-off between the delay and the length of the line: if a line cannot be enunciated within the delay, it is truncated so the next one can be read. For the current contents of the Washington and Yellowstone "Read me" files, 5000 ms is a good amount of delay.

For a better solution for reading out multiple lines or a whole paragraph, I looked at a couple of other text-to-speech programs, such as JAWS and NaturalReader.

Unfortunately JAWS doesn't read out lines like sentences! I mean, it reads only up to a newline character and then waits for the user to press the down arrow or Enter. It does respect punctuation such as commas and full stops, as does our current TTS engine. But JAWS doesn't read out a whole paragraph at a stretch the way we are doing now.

NaturalReader is a commercial text-to-speech program from NaturalSoft.
http://www.naturalreaders.com/?gclid=CK37p-iu1aMCFRNSgwodCHuNvw

It loads the text in a different fashion: it requires the user to select a portion of the paragraph and then start pronunciation.

In the current version of mGIS, the program reads a single line from the "Read me" file, enunciates it, and then loads the next line. As an alternative, we could try loading the whole text at once and then starting pronunciation, the way NaturalReader does.
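That alternative could be as simple as joining the file's lines before handing them to the speech engine, which would remove the fixed 5000 ms sleep and the truncation problem entirely. A sketch, where speak() stands in for whatever TTS call mGIS actually uses (a hypothetical placeholder, not a real API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Load the whole "Read me" description at once instead of line by line.
public class MapDescription {
    public static String loadDescription(String path) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(path));
        // Join with spaces so the engine treats the overview as
        // continuous prose rather than stopping at each newline.
        return String.join(" ", lines).trim();
    }

    public static void describe(String path) throws IOException {
        speak(loadDescription(path)); // one call, no per-line sleeps
    }

    private static void speak(String text) {
        // placeholder for the TTS engine call
    }
}
```

Whether this works depends on whether our TTS engine accepts arbitrarily long input in a single call; that's the thing to test first.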

Monday, July 19, 2010

Wacom tablet screen mapping

As I was using the Wacom tablet without installing the driver, it was working in its default mode. The default mode for the mouse is the same as a normal mouse, and the default mode for the pen is screen mapping. In pen-mode screen mapping, wherever I put the tool (pen or mouse) on the tablet, the cursor jumps to the corresponding point on the screen. This is also known as absolute positioning.

This was the problem we encountered while testing mGIS for the first time with Jake. Now I have downloaded and installed the Wacom Intuos3 driver from the Wacom site.

http://www.wacom.com/downloads/drivers.php

Here is the link for Wacom Intuos3 manual:

http://www.mannlib.cornell.edu/files/documents/Wacon_PTZ630_UsersManual.pdf

Now the tablet is working really nicely with the pen/mouse in screen-mapping mode.

Friday, July 16, 2010

Meeting in Seattle

Our first meeting with Jake Cook in Seattle was a clear milestone for the mGIS project. Jake brought to light a couple of things we hadn't paid attention to before.

1. The tablet pen seems preferable to the mouse, although we could not use the tablet in tablet mode. Jake compared the pen to a blind person's cane.

2. Alerting the user when the mouse is going outside the map panel (e.g., at the edge of the screen, on the menu bar, etc.) was an issue. Using JAWS may help us.

3. Even the basic map of Yellowstone Park seemed quite complicated to Jake. This could be for several reasons:
a. Jake didn't have an overall idea of the map.
b. The map has a number of buffered rivers (13) with many bends. He felt better when we zoomed in and he found only one or two wide rivers on screen. We need to think about simplifying the rivers; increasing the amount of buffer may help partially.
c. We didn't set up a task list for Jake. Maybe Amy could help us do so. Jake was wondering what he needed to do. He was trying to understand the direction of a river, i.e., which way it flows. For one river he found the correct direction.

4. We also tested the multi-touch tablet. Although it was not big enough for the screen, it worked well for Jake. He was using his right index finger.

Thursday, June 24, 2010

Headsets or not?

In our Monday meeting, we briefly discussed direction and distance of sound cues.  These are tied to the use of headphones, and indirectly to Jake's idea about voice input:

There are different levels of "proximity" information we could provide:
  • Distance only (no direction) - just with loudness.  We could scale loudness based on a realistic function (and we should at least find out what that is), or we could use an artificial function to either increase or decrease the range at which proximate regions become audible.  This is further broken into two sub-cases: 
    • Continuous distance scaling.  If the loudness of an object differs as a continuous function of distance, then one can judge direction by moving the mouse (maybe ... this may be difficult). 
    • Discrete scaling.  The simplest version of this is to have an extra buffer around each object, in which its sound is audible but less loud.  With discrete scaling, mouse motion does not reveal the direction to an object unless the boundary of a surrounding buffer is crossed. 
  • Directional.  The most accurate directional audio requires headphones and computation with an individual head-related transfer function (HRTF), but some left-right directionality can be achieved with a generic model and even with ordinary stereo speakers.  We are better at sensing the direction of high-pitched sounds than lower pitches (which is why a surround-sound system has more tweeters and mid-range drivers than woofers, and why high-end stereo systems often use a single subwoofer).  High-quality directional audio is a good deal more computationally intensive than simply varying loudness, but directional audio is supported by program libraries (including for Java) because it is used in some games. 
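The two distance-only scalings above can be sketched as simple gain functions. The particular falloff curve, buffer width, and buffer loudness below are made-up parameters for illustration, not decisions we've made; the realistic falloff we still need to look up would replace continuousGain.

```java
// Distance-to-loudness mappings for proximity cues; gain is in [0, 1].
public class ProximityLoudness {

    // Continuous scaling: an artificial smooth falloff with distance.
    // refDistance controls how quickly the sound fades.
    public static double continuousGain(double distance, double refDistance) {
        if (distance <= 0) return 1.0;          // cursor inside the object
        double r = distance / refDistance;
        return 1.0 / (1.0 + r * r);             // inverse-square-like shape
    }

    // Discrete scaling: full loudness inside the object, one fixed
    // quieter level inside a surrounding buffer, silence beyond it.
    public static double discreteGain(double distance, double bufferWidth) {
        if (distance <= 0) return 1.0;
        if (distance <= bufferWidth) return 0.5; // audible but less loud
        return 0.0;
    }
}
```

The continuous version lets loudness changes hint at direction as the mouse moves; the discrete version only reveals anything when the buffer boundary is crossed, which matches the trade-off described above.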
Amy noted that she has never seen a blind person wearing headphones, and she conjectured that blind people might find them objectionable because they "close off" the wearer. 

The indirect link between this and Jake's idea about audio input is that the most accurate audio input comes from headset microphones.  The built-in microphones in most computers are poor quality and/or have a problem with noise from the computer.  It should be possible to provide a good-quality microphone that is not part of a headset, but it could be a challenge to keep it well-positioned relative to a blind user.