InnerSelf

By Chris Lovett

While driving home from work in 1996 I had the following vision of an amazing handheld device called “InnerSelf”. I shared this with a bunch of Microsoft executives when I first joined Microsoft in 1997, and they must have all thought I was crazy because I got no response 😊. I have updated the language a bit, but only to add “web service” since that is now accepted terminology. It’s not a big jump to replace the language that says “rule based” with “LLM”, and you can see many people are trying to build this idea right now.

The idea was centered around three concepts:

Totally real speech and voice interface – making a GUI irrelevant.
Indestructible design - it is heavily padded so a kid could take it to school and practically kick it around the playground.
It becomes an extension of your personal thought life.

It is a soft squishy cube about 4.5 inches across, like this one only with rounded corners. It has big powerful speakers for awesome sound quality and an array of microphones for input. It has a matrix of transducers woven into the fabric for detecting pressure, heat, and moisture, and also contains internal sensors for detecting acceleration. The device is water-resistant, contains massive amounts of indestructible permanent storage, and has built-in high-bandwidth wireless connectivity. It also has a built-in GPS system. Underneath, it has a headset jack for more confined communication, and a power plug for charging.

Purpose

Its purpose in life is to be a personal assistant, counselor, agent and fun toy. It understands voice, and can produce aesthetic speech and sounds.

Personal Assistant

Security – first of all it only responds to the voice of its owner. The owner selects a name for it and addresses it this way. The data inside is locked up, maybe even embedded in a kind of resin, so that someone who steals the device cannot break it open and hack into your personal information. (Chip makers use techniques like this to prevent reverse engineering of their designs.)

For the purpose of this paper, let’s call it “Computer” – boring I know.

Time Management – to help you wake up, make meetings on time, and go to bed. A kid could easily teach it his school class schedule, music practice, etc.

The night before: “Computer, can you wake me up at six thirty?”
Computer: “Ok”.
6:30am “Good morning …. (your name)…” followed by some wake-up sound you’ve downloaded into it from the web – let’s say it is the sound of cockatoos screeching in Australia.
6:32am You: “Ok, I’m getting up”. The computer knows that it’s a weekday, when getting up is more important than on the weekend. It has also learned that you typically do not get up, and that the best way to get you up is to ignore you and keep on playing the annoying sound.
6:50am You pounce out of bed and grab the box. Because you are squeezing it, it knows you’re up.
Computer: “Ok, ok” and shuts off the sound…. Pause … then a warm greeting, slightly tongue in cheek, the computer says “It’s good to see you up and about this morning”.
7:50am “Computer, my first class is math, right?”
Computer: “yep”.
12:12pm You get invited to a party on the weekend by some friends at school. You want to go, so you check your schedule. “Computer, I’m going to Matt’s place Sunday afternoon”. You don’t ask for permission – you inform the computer so that it can help you in case there’s a conflict.
Computer: “There’s a conflict at home, do you want me to work it out?”
You: “Later”. You smile to your friend in a way that indicates you’ll be there. The phrase “at home” was critical information from the computer to help you make the decision to be there. You remember now that you had a prior commitment to go hiking with your Dad, and you know your Dad is flexible on this. You don’t even need to ask your computer in this case.
Since you answered “Later”, the computer starts working on communicating with your Dad to let him know the hike is off. The computer will inform you about its progress on this front “Later”. This means after all your meetings, or after school in this case.
The computer is wired so it sends an email to your Dad’s InnerSelf containing a structured scheduling message, which that device uses to update its calendar. It returns a confirmation message saying it got it.
3:30pm: The computer detects you are on the bus because it’s so noisy and then detects when you get off the bus because it becomes quiet, so it speaks, “Dad got your message”. This prompts you to think about what you are going to say over dinner. You answer “Thanks”, not realizing the computer doesn’t even know you thanked it, since you didn’t prefix it with “Computer”.
6:30pm: Dad “So, what’s happening on Sunday?” Your response is prepared.
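The structured scheduling message and its confirmation, as exchanged between the two devices in the scenario above, could be sketched roughly as follows. This is only an illustration: the field names, the JSON encoding, the date, and the `receive` function are all assumptions made for this example, not part of the original idea.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ScheduleMessage:
    """Hypothetical scheduling message; every field name is an assumption."""
    sender: str
    recipient: str
    action: str  # only "cancel" is handled in this sketch
    event: str
    date: str    # ISO date, illustrative only


def encode(message: ScheduleMessage) -> str:
    """Serialize the message so it can travel in an email body."""
    return json.dumps(asdict(message))


def receive(raw: str, calendar: dict) -> str:
    """Apply the message to the recipient's calendar and return a confirmation."""
    message = ScheduleMessage(**json.loads(raw))
    if message.action == "cancel":
        # Remove the event if present; ignore it if it was never scheduled.
        calendar.pop((message.date, message.event), None)
    return json.dumps(
        {"status": "received", "event": message.event, "date": message.date}
    )


# Dad's calendar before the message arrives (made-up data).
dads_calendar = {("1997-06-15", "hiking"): "with my kid"}

raw = encode(ScheduleMessage("kid", "dad", "cancel", "hiking", "1997-06-15"))
confirmation = json.loads(receive(raw, dads_calendar))
print(confirmation["status"], dads_calendar)
```

The confirmation is what lets the kid's device later say “Dad got your message” without either person having to touch the calendar directly.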

Several important things come up in this scenario:

For prolonged use over several years, less is more. Absolutely efficient minimal communication makes for a lasting relationship between owner and computer.
Is the computer really learning? No, it is just doing what you told it to do previously. It may also be able to download new programming from the InnerSelf Web Service, which simulates intelligence by trying fresh ideas on how to solve problems. This may be how it found the right way to wake you up in the morning. The central web service is where the psychologists are employed to make the InnerSelf experience semi-intelligent.
Colloquial language – how does it learn and understand this? Again, training from the user, and central web dictionaries. For example, you may respond to the computer with a new colloquial phrase it has never heard. Instead of bothering you on the spot to explain yourself, it first consults the central web dictionary to see if it can find an answer. There may even be real people on call behind the curtain, so to speak, to help resolve this one.
Now you see why massive storage is a requirement!!
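The lookup order described for colloquial phrases (what the owner has already taught it, then the central web dictionary, then people on call, and only then the owner) could be sketched as a simple fallback chain. The resolver names and the sample dictionary entries below are hypothetical, and plain in-memory dictionaries stand in for what would really be a web service and human operators.

```python
from typing import Callable, List, Optional, Tuple

# A resolver takes a phrase and returns its meaning, or None if unknown.
Resolver = Callable[[str], Optional[str]]


def make_dictionary_resolver(entries: dict) -> Resolver:
    """Build a resolver backed by an in-memory dictionary (a stand-in only)."""
    def resolve(phrase: str) -> Optional[str]:
        return entries.get(phrase.lower())
    return resolve


def resolve_phrase(
    phrase: str, resolvers: List[Tuple[str, Resolver]]
) -> Tuple[str, Optional[str]]:
    """Try each resolver in order; fall back to asking the owner last."""
    for name, resolver in resolvers:
        meaning = resolver(phrase)
        if meaning is not None:
            return name, meaning
    return "ask_owner", None


# Made-up sample data for each stage of the chain.
local = make_dictionary_resolver({"later": "defer until after school"})
central = make_dictionary_resolver({"no worries": "that is fine"})
human = make_dictionary_resolver({})
chain = [("local", local), ("central", central), ("human", human)]

print(resolve_phrase("Later", chain))
print(resolve_phrase("No worries", chain))
print(resolve_phrase("yeet", chain))
```

Putting the owner last in the chain reflects the “less is more” point above: the device interrupts you only when every other source has failed.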

Diary

A natural progression from day to day activities is to extend this and keep a diary of thoughts and feelings, ambitions and complaints. Diary entries are prefixed with a special prefix:

“Log: well, today was challenging and exhilarating. I really liked getting up in front of class and making everyone laugh. It was so scary, but when they all laughed at my jokes I felt on top of the world. Maybe I’ll be a comedian one day”.
The computer also wraps this entry with a date/time stamp and enough context to understand what you were talking about.

You can replay earlier diary entries:

“Computer, what did I think about Jeff two years ago?”

Jeff did something to offend you, and you want to think about Jeff for a while. The computer looks up a diary entry around the time you entered your first impressions of someone new that you met called Jeff, and it plays the recording. It was saved as text, but played in a simulation of your own voice.

The computer helps you remember that Jeff has stuck with you as a friend through the years even when you have not treated him particularly well on some occasions. You decide to forgive Jeff and move on.

This alone, over several years, becomes a massively powerful database of thoughts and ideas that help shape your life. The absolute ease of input is what makes this diary work for people where all other diary systems have failed. People forget so much – especially in the fast-paced world we live in today.

Again, massive, almost infinite storage that is virtually indestructible is a requirement.

Counsellor

Clearly, it is not a huge leap to connect the diary and daily schedule with the ability to solve human problems. The key again is downloadable human intelligence working on your behalf, as follows:

You: “Computer: Susan gets on my nerves soooo much!! Why does she always put me down and make me feel so stupid like that?”
Yikes, now the computer is being asked to guide a human being through the most important aspect of life – relationships.
Here the computer draws upon a rule based system of knowledge about how to deal with people. The owner can configure which source of knowledge it should use. For example, a religious person would give it the Bible. A Tony Robbins fan would point the computer at all his books. Tony Robbins may even provide a web service that connects to this device.
The computer is trained to listen, empathise and respond abstractly. Imagine the following: Computer: “Sounds like Susan has hurt you like this before.”
You: “Yeah, I feel it’s time to do something to get back at her, got any ideas?”
At this point, the computer has been asked for feedback, so it answers with a quote from the literature you have pointed it to previously. From the Bible it might say “Well, a good rule to live by is to love your neighbor as yourself”.
You: “Oh, who said that?”
Computer: “Moses, about 3 ½ thousand years ago”.
At this point it could turn into a history lesson and you could forget all about your troubles with Susan, or you could go off on a tangent and ask about career advice.
To end the communication you say something like “Ok, thanks”.

Agent

With wireless capabilities and back-end web services supporting this device, there are all sorts of opportunities to get the software to do real work on your behalf. We have already discussed several ways that this can occur, but let’s look at a few more scenarios.

“Computer, I need a vacation!!”
Computer: “Where would you like to go, what would you like to do, how long, when do you want to go, how much would you like to spend, do you want to meet new people?” etc., etc.

Then over the course of the next week the computer, with help from back-end web services, scours the web for special deals and comes up with ideas, tuning its search as it goes based on the feedback you give it.

Computer: “You can go to Maui for just $439”
You: “Nah, too hot”.
This feedback narrows the search, and so forth.

Another Agent scenario is just an ongoing background ear-to-the-ground kind of service based on your own personal interests. For example, I happen to be interested in Australia, but it is extremely hard to find anything meaningful about Australia in American news. So the Agent goes to work for me and reports back what it found.

You: “Computer: What’s new in Australia?”
Computer: “I have information on Sports, Business, Weather, and People”
You: “Tell me about Business and the Olympics”
Computer: The computer reports business news relating to the Olympics. “They are building a new airport in Brisbane”.
You: “What? Tell me more…”
(This is where a voice interface gets really interesting, being able to drill in on stuff, without having to move and click a mouse!!)
Later on, “Computer, find me a Brisbane public official so I can send a complaint about this development”.

Again the computer acts as agent and helps you communicate your concerns to the city planning department. Perhaps you thought of something they didn’t think of, like the airport might pollute the water supply of the last remaining habitat of the endangered Birdwing Butterfly.

Combining your interests with agents (or web services) that help you find interesting information and communicate efficiently with the right people at the right time results in a powerful combination. This sort of thing done on a global scale could have profound effects. For example, it would make corruption in government much harder to get away with because a lot more eyes and ears may be “actively” watching what officials are doing – not just watching what is designed to be watched on TV, but actively probing and looking at things from different angles, in their own time, based on their own interests.

Fun Toy

The box includes some fun behaviors. For example, you drop it and it says “ouch”. It detects a drop by using the movement transducers and the pressure transducers. Sudden acceleration combined with one side receiving pressure indicates it was either thrown or dropped.
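As a rough sketch of that drop rule, assuming the sensors report an acceleration magnitude and a pressure reading per face: the threshold values, the units, and the names `SensorSample`, `is_drop`, and `react` are all hypothetical, and a real device would need tuning against actual sensor data.

```python
from dataclasses import dataclass
from typing import Dict

# Assumed thresholds, chosen only for illustration.
ACCEL_THRESHOLD_G = 3.0   # "sudden acceleration", in g
PRESSURE_THRESHOLD = 0.5  # per-face pressure on an arbitrary 0..1 scale


@dataclass
class SensorSample:
    """One hypothetical reading from the movement and pressure transducers."""
    acceleration_g: float
    side_pressure: Dict[str, float]


def is_drop(sample: SensorSample) -> bool:
    """Sudden acceleration plus pressure on exactly one side suggests an impact."""
    pressed = [
        side for side, pressure in sample.side_pressure.items()
        if pressure >= PRESSURE_THRESHOLD
    ]
    return sample.acceleration_g >= ACCEL_THRESHOLD_G and len(pressed) == 1


def react(sample: SensorSample) -> str:
    """Return what the box would say for this sample, or an empty string."""
    return "ouch" if is_drop(sample) else ""


# A hard landing on the bottom face versus a gentle two-handed squeeze.
dropped = SensorSample(5.2, {"bottom": 0.9, "top": 0.0, "left": 0.1})
squeezed = SensorSample(0.3, {"left": 0.8, "right": 0.8, "top": 0.0})
print(repr(react(dropped)), repr(react(squeezed)))
```

Requiring pressure on exactly one side is one way to tell an impact apart from the owner squeezing the box, which presses several sides at once with little acceleration.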

Detecting moisture it may put on an under-water sound effect saying indignantly “Hey, I’m drowning here!!” This warns the user to do something to save their precious data, but conveyed in a way that gives it life-like qualities.

It may download random human-created toyish effects designed to use its powerful array of transducers in new and creative ways, giving it a continual freshness. Some sort of smell transducer could be interesting too.

This toyness is really just an icebreaker between owner and device more than anything else – to help the owner place trust in the device.

Conclusion

A totally real voice interface, combined with lots of small effortless data inputs over a long period of time, and massive amounts of storage results in a device that will become indispensable to its owner.

Clearly there are technical challenges. However, storage is continuing to increase at phenomenal rates, and despite the threat of reaching physical limits, new technologies like holographic storage seem promising. Also, with speech and voice, it is interesting to me that when this threshold is reached, a GUI for all this functionality is NOT needed, resulting in a device that has the potential to be a lot more indestructible than the delicate handheld devices we have today.

There are also human challenges to overcome – like trust. Do I really trust this device not to leak information about me back to some central authority? What if I start talking crazy about guns? Does it notify the police? Well, if you’re in the high school cafeteria at the time, maybe it should discreetly notify the school counselor!

Anyway, this device is still a ways out, but technology that would make it possible is rapidly approaching. The result, if executed well, could be world changing.