METHOD AND SYSTEM FOR VOICE EXCHANGE AND VOICE DISTRIBUTION
FIELD OF THE INVENTION
This invention relates to the field of packet communications, and more particularly to voice packet communication systems.
BACKGROUND OF THE INVENTION
Many users of on-line services utilize text-based communication systems for the exchange of messages. Two well known text-based communication techniques are e-mail, wherein text messages are placed in a central file associated with a destination address, to be downloaded at a later time when the recipient "logs in", and instant messaging, where text is typed and exchanged between computers when a "buddy" address (or group address) is present in an address field. Although it is possible to attach files to the text file for the transfer of non-text formats, including graphic and audio files, this technique is greatly limited. When an audio file is attached, the technique lacks a method for convenient recording, storing, exchanging, responding and listening to voices between one or more parties, independent of whether or not they are logged in to their network.
SUMMARY OF THE INVENTION
The present invention is a system and method for voice exchange and voice distribution utilizing a voice container. Based on states, rules and type of devices provided, voice containers can be stored, transcoded and routed to the appropriate recipients instantaneously or stored for later delivery. The present invention system and method for voice exchange and voice distribution allows a software agent with a user interface in conjunction with a central server to send, receive and store messages using voice containers. In addition, the present invention for voice exchange and voice distribution provides the ability to store messages both locally and centrally at the server whenever the recipient is not available for a prescribed period of time. Additionally, the present invention allows manual or pre-programmed control of the origination, distribution and listening to these messages, and also offers the options of ringing a pre-configured phone number at the recipient's request for the delivery of the message or forwarding the message to another Internet or voice container enabled device.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete understanding of the present invention may be obtained from consideration of the following description in conjunction with the drawings in which:
FIG. 1 is a high level functional block diagram of the system for voice exchange and voice distribution;
FIG. 1A is the high level functional block diagram of FIG. 1 including a voice format detection and translation system;
FIG. 2 is a high level overview of the system architecture;
FIG. 3 is an exemplary embodiment of the voice container structure;
FIG. 4 is a high level flow chart for PC to PC and PC to network communications utilizing the system for voice exchange and voice distribution;
FIG. 5 is a high level flow chart for dial in emulation from a telephone utilizing the system for voice exchange and voice distribution;
FIG. 6 is a high level flow chart for spot calling utilizing the method and system for voice exchange and voice distribution;
FIG. 7 is a flow chart of an exemplary embodiment illustrating the method and system with respect to the originator;
FIG. 8 is a flow chart of an exemplary embodiment illustrating the method and system with respect to the central server;
FIG. 9 is a flow chart of an exemplary embodiment illustrating the method and system with respect to the recipient;
FIG. 10 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the originator of a voice spot;
FIG. 11 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the central server for a voice spot;
FIG. 12 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the recipient of a voice spot;
FIG. 13 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the originator and recipient for an anonymous voice communication;
FIG. 14 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the central server for an anonymous voice communication;
FIG. 15 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the central server for emulation through a telephone system;
FIG. 16 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the originator of a voice container with multimedia attachments;
FIG. 17 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the central server for a voice container with multimedia attachments;
FIG. 18 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to the recipient of a voice container with multimedia attachments;
FIG. 19 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to preparing a voice container without a PC; and,
FIG. 20 is a flow chart of an exemplary embodiment illustrating the method and system for voice exchange and voice distribution with respect to playing a voice container on a non-PC based appliance.
DETAILED DESCRIPTION OF VARIOUS ILLUSTRATIVE EMBODIMENTS
Although the present invention, a method and system for voice exchange and voice distribution, is particularly well suited for use in connecting Internet users and shall be so described, the present invention is equally well suited for use in other network communication systems such as an Intranet, Extranet and interworking with traditional PSTN (Public Switched Telephone Network). While the present invention is particularly well suited for voice exchange it is equally well suited for any form of audio message exchange.
When the present invention, a method and system for voice exchange and voice distribution, is accessed by a communication device through a non-packet link, the voice packet (voice container) is converted into the corresponding protocol and form necessary for communication with the communication device as well as to cross through the non-packet link.
Transmission Control Protocol/Internet Protocol (TCP/IP) is the communications standard between hosts on the Internet. TCP/IP defines the basic format of the digital data packets on the Internet allowing programs to exchange information with other hosts on the Internet.
Domain names direct where e-mail is sent, files are found, and computer resources are located. They are used when accessing information on the World Wide Web (Web or WWW) or connecting to other computers through Telnet. Internet users enter the domain name, which is automatically converted to the Internet Protocol address by the Domain Name System (DNS). The DNS is a service provided by TCP/IP that translates the symbolic name into an IP address by looking up the domain name in a database.
E-mail was one of the first services developed on the Internet. Today, e-mail is an important service on any computer network, not just the Internet. E-mail involves sending a message from one computer account to another computer account. E-mail is used to send textual information as well as files, including graphic files, executable files, word processing files and other files. E-mail is becoming a popular way to conduct business over long distances. Using e-mail to contact a business associate can be faster than using a voice telephone, because the recipient can read it at a convenient time, and the sender can include as much information as needed to explain the situation.
Simple Mail Transfer Protocol (SMTP) was developed to provide for reliable and efficient transfer of e-mail between different communication environments. SMTP is independent of a particular transmission subsystem and requires only a reliable data stream channel. The ability to relay e-mail between different communication environments is an important feature. SMTP is described in Internic RFC #821, entitled "Simple Mail Transfer Protocol" dated August 1982 (http://ds.internic.net/rfc/rfc821.txt), which is incorporated herein by reference.
A transport service provides an interprocess communication environment (IPCE). An IPCE may cover one network, span several networks, or cover a subset of a network. IPCEs are not one-to-one connections, but may communicate through another process, such as a mutually known IPCE. E-mail is a use of interprocess communications. E-mail can be communicated between processes in different IPCEs by relaying it through a process connecting two or more IPCEs. Therefore e-mail can be relayed between hosts on different transport systems by a host on both transport systems.
The interconnection between different systems requires a standard for the format of e-mail messages. One such standard is described in Internic RFC #822, entitled "Standard For The Format Of ARPA Internet Text Messages" dated August 13, 1982 (http://ds.internic.net/rfc/rfc822.txt), which is incorporated herein by reference.
In 1989, researchers at CERN (The European Laboratory for Particle Physics) wanted to provide a better method for widely dispersed groups of researchers to share information. The researchers needed a system that would enable them to quickly access all types of information with a common interface. By the end of 1990, researchers at CERN had a textual browser and a graphical browser developed.
A browser is an application which knows how to interpret and display hypertext documents that are located on the Web. Hypertext documents contain commands, references and links to other text and documents. This allows a reader to quickly access related text. In addition to text, many documents contain graphics, audio and animation.
HTTP (HyperText Transfer Protocol) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred. HTTP is described in a working document of the Internet Engineering Task Force (IETF), entitled "HyperText Transfer Protocol - HTTP/1.1" dated November 22, 1995, which is incorporated herein by reference.
HyperText Markup Language (HTML) is an authoring software language used to create Web pages. HTML is basically ASCII text surrounded by HTML commands in angle brackets, which are then interpreted by a browser. Standard Generalized Markup Language (SGML) is a text-based language for describing the content and structure of digital documents. SGML documents are viewed with transformers, which render SGML data the way Web browsers render HTML data. Extensible Markup Language (XML) is a pared-down version of SGML, designed especially for Web documents. It enables designers to create their own customized tags to provide functionality not available with HTML.
A Uniform Resource Locator (URL) is a pointer or link to a location. The URL contains a transmission protocol, such as HyperText Transfer Protocol (HTTP), a domain name of the target computer system, a page identifier and a bookmark.
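As a minimal illustration, the parts of a URL named above can be separated with the `urllib.parse` module of the Python standard library. The address below is hypothetical, and mapping the "page identifier" to the path and the "bookmark" to the fragment is an assumption of this sketch rather than terminology from the specification.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only for illustration.
url = "http://www.example.com/listings/page1.html#features"

parts = urlsplit(url)
protocol = parts.scheme        # transmission protocol, here "http"
domain_name = parts.hostname   # domain name of the target computer system
page_identifier = parts.path   # page identifier (assumed to be the path)
bookmark = parts.fragment      # bookmark (assumed to be the fragment)

print(protocol, domain_name, page_identifier, bookmark)
```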
The WWW is the graphical data transfer area of the Internet. This is the area of the Internet where home pages and web sites are found. The WWW has become a popular place to advertise businesses, but it can also be used as a front end for electronic commerce (e-commerce). Many companies have on-line ordering on their web sites. While this segment of the web is not growing as fast as many analysts predicted, it is still gaining wide acceptance as the public's trust of web security grows.
An Intranet is similar to the Internet except it is used to disseminate information within a company's network and is protected from the general public through the use of a Firewall. Sometimes, the users on an Intranet will have access to sites on the Internet, but unregistered users on the Internet do not have access to the Intranet.
An Internet Browser is a program that is able to read HTML and follow Hyperlinks in order to present the information included on a World Wide Web site. In addition, a browser has the capability of entering data on forms included on those web sites and has the capability to download information off of a web site. Most Internet browsers increase the speed of data transmission by sending downloaded data to a cache directory, where it can be accessed again the next time the data is requested rather than downloading it off the web site again.
On-line commerce, or e-commerce, uses the Internet, of which the World Wide Web is a part, to transfer information about goods and services in exchange for payment or customer data needed to facilitate payment. Potential customers can supply a company with shipping and invoicing information without having to tie up sales staff. The convenience offered to the customer is that they don't have to drive around town all day looking for the product they want.
An intelligent agent must have the capability to take actions leading to the completion of a task or objective, such as accessing security databases for validation of credit card information, reading e-mail, determining status of a recipient of a message, validation of message addressing, etc., without trigger or input from an end-user. The details of the programming of the intelligent agent are known to those skilled in the art. The functioning and design of intelligent software agents are described in "Software Agents: An Overview" by Hyacinth S. Nwana, Knowledge Engineering Review, Vol. 11, No. 3 pp 1-40, September 1996 and "Intelligent Agents: A Technology And Business Application Analysis" by Kathryn Heilmann et al., URL: http://www-iiuf.unifr.ch/pai/users/chantem heilmann, 1998, which are herein incorporated by reference.
Description of the Method
The present invention system and method for voice exchange and voice distribution between computers, telecommunication devices and Internet appliances provides the ability to communicate spontaneously, in the user's own voice, without the limitations of written communications for natural expression. In a broad overview, the present invention for voice exchange and voice distribution provides a voice intercom system with instant messaging, distributed over the Internet. The present invention is like a voice intercom system in that one of the parties in the conversation may speak or listen, but not both at once.
Referring to FIG. 1 there is illustrated a high level functional block diagram of the system for voice exchange and voice distribution. The present invention system and method for voice exchange and voice distribution 20 allows a software agent 22 with a user interface in conjunction with a central server 24 to send, receive and store messages using voice containers illustrated by transmission line 26 in a pack and send mode of operation to another software agent 28. A pack and send mode of operation is one in which the message is first acquired, compressed and then stored in a voice container 26 which is then sent to its destination(s). In addition, the present invention for voice exchange and voice distribution provides the ability to store messages 30 both locally and centrally at the server whenever the recipient is not available for a prescribed period of time. Additionally, the present invention allows users to send and receive voice messages via conventional analog phones 32 and 34, in which case the user's agent 36 is located remote to the user and preferably proximate to or integrated with the server. In this case the remote agent allows manual or pre-programmed control of the origination, distribution and listening to these messages, and also offers the options of ringing a pre-configured phone number at the recipient's request for the delivery of the message or forwarding the message to another Internet or voice container enabled device.
With reference to FIG. 1A, the present invention is designed to adapt to the voice and data compression capabilities of the user's existing hardware and software platform. More specifically, the agent of the present invention may be adapted to work on a personal computer, wireless handheld computer such as a personal digital assistant (PDA), digital telephone, or beeper. In each case different voice and compression applications and data formats may be available as dictated by the hardware platform and software residing thereon. The present invention includes a voice/compression software detector 38 and 40 that communicates the format of the voice data to be transmitted and/or received.
Voice data is transmitted to the server in the format provided by the agent. Where the Personal Computer includes several voice compression formats, the agent may include a hierarchical list of preferred formats from which the most preferred format is selected. Criteria for selecting the format may include transmission bandwidth, lossy versus lossless compression and voice quality parameters such as sampling rates. The voice data is transmitted in a voice container. The term "voice containers" as used throughout this application refers to a container object that contains no methods, but contains voice data or voice data and voice data properties. In the latter case, voice data properties may be tailored to the use desired by the user or may be inherent from the voice data and/or hardware platform upon which the agent is residing. For example, the agent when residing on a PDA may only have one voice data format available. In later versions of the Windows 95, 98, 2000 and NT operating systems by Microsoft, the GSM data compression or codec is included. The server is adapted to recognize the voice format of voice data contained in the voice containers; this information may be communicated by the agent prior to a voice container transmission, included in the voice container or provided to the server from the agent when polled by the server.
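The container and the hierarchical format list described above might be sketched as follows. This is only an illustration under stated assumptions: the class name `VoiceContainer`, the function `select_format` and all format names other than GSM are hypothetical, and the container is modeled as a plain data holder with no behavior of its own.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VoiceContainer:
    """Plain data holder: voice data plus optional voice data properties."""
    voice_data: bytes
    properties: Dict[str, str] = field(default_factory=dict)


def select_format(preferred: List[str], available: List[str]) -> Optional[str]:
    """Return the most preferred format that the platform actually offers."""
    for fmt in preferred:
        if fmt in available:
            return fmt
    return None


# Hypothetical format names, ordered from most to least preferred.
preferred_formats = ["lossless-hi", "gsm", "pcm-8k"]
pc_formats = ["pcm-8k", "gsm"]   # a PC with several formats available
pda_formats = ["pcm-8k"]         # a PDA with a single format available

chosen_pc = select_format(preferred_formats, pc_formats)
chosen_pda = select_format(preferred_formats, pda_formats)

# The chosen format travels with the voice data as a property.
container = VoiceContainer(b"\x00\x01", {"format": chosen_pc})
```

Carrying the format inside the container corresponds to only one of the three options the text lists for informing the server; announcing it before transmission or answering a poll would work equally well in this sketch.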
In the presently preferred embodiment, the data format available is provided to the server upon the initial session communication between the agent and the server.
Voice containers transmitted from a sending agent to a receiving agent having a different data format are routed through the server, in which a translator 42 converts the voice data in the voice containers from the sender's data format to the receiver's data format.
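A rough sketch of this translation step is shown below. The registry `TRANSCODERS`, the function `translate` and the placeholder conversions are assumptions made for illustration; real codecs would replace the lambdas, which here only tag the data so the path taken is visible.

```python
from typing import Callable, Dict, Tuple

Transcoder = Callable[[bytes], bytes]

# Hypothetical registry keyed by (sender format, receiver format).
TRANSCODERS: Dict[Tuple[str, str], Transcoder] = {
    ("gsm", "pcm"): lambda data: b"PCM:" + data,  # placeholder, not a real codec
    ("pcm", "gsm"): lambda data: b"GSM:" + data,  # placeholder, not a real codec
}


def translate(voice_data: bytes, sender_format: str, receiver_format: str) -> bytes:
    """Convert voice data only when the two agents use different formats."""
    if sender_format == receiver_format:
        return voice_data  # no translation needed
    try:
        transcoder = TRANSCODERS[(sender_format, receiver_format)]
    except KeyError:
        raise ValueError(
            f"no translator from {sender_format} to {receiver_format}"
        ) from None
    return transcoder(voice_data)


same = translate(b"abc", "gsm", "gsm")
converted = translate(b"abc", "gsm", "pcm")
```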
Referring to FIG. 2 there is illustrated a high level overview of the system architecture. A Software Agent utilized by the sender of the voice container provides the following functionality: log on to a central server 46; authenticate to the central server 48; address the recipient(s) and pack message into a voice container or multiple voice containers 50; and, enable transport 52 of the voice container to the recipient or the central server.
A central server is made up of several sub-components including an authentication server 54, a message server 56, a notification server 58, a registration server 60, a proxy server 62, an OA&M Server 64, a capabilities exchange 66, a compression engine 68, a transcoding server (translator) 70 and transport server 72. Those structures are discussed in further detail below.
The Central Server provides the following functionality: register and authenticate the senders and receivers; control the identifiers of software agents; maintain and provide the status of all software agents; store the voice container if the recipients are not available; convert the voice container for PSTN (Public Switched Telephone Network); and, generate outgoing calls and emulate the software agent when the sender or recipient is a traditional phone or other voice container enabled device.
A Software Agent utilized by the recipient provides the following functionality: log on to the central server; authenticate to the central server; retrieve any undelivered voice containers; and, unpack the voice container and play the message.
The Connection Service Description
Software Agent - Sender: With a simple software agent loaded on a Personal Computer (PC) or other Internet compatible appliance, the sender will log on, authenticate, and notify the central server of its status. To create a message, the software agent will address, pack and send the message in a voice container.
Central Server: The central server in conjunction with the software agent controls, stores and switches the voice containers to the appropriate recipients. The server will initially register and authenticate the software agent. It will track and maintain the status of all software agents. It will notify the software agent to send the voice container directly to the recipient if the recipient is available or it will store the voice container for the intended recipient if the recipient is not available. In addition, it will also convert the voice container for delivery over traditional phone networks if the recipient is a phone or to other voice container enabled devices.
Software Agent - Recipient: If the recipient is not on-line, the messages will be transported to them when they log on to a network. The software agent will open the voice container upon arrival and play the message to the user.
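The store-or-forward behavior of the three roles above can be sketched in a few lines. This is a simplified model, not the described server: the class name `CentralServer`, its method names and the string return values are hypothetical, and authentication, registration and PSTN conversion are left out.

```python
from collections import defaultdict
from typing import Dict, List


class CentralServer:
    """Tracks agent status and stores containers for absent recipients."""

    def __init__(self) -> None:
        self.online: Dict[str, str] = {}  # user id -> IP address
        self.stored: Dict[str, List[bytes]] = defaultdict(list)

    def log_on(self, user: str, ip_address: str) -> List[bytes]:
        """Mark the agent online and hand over any undelivered containers."""
        self.online[user] = ip_address
        return self.stored.pop(user, [])

    def log_off(self, user: str) -> None:
        self.online.pop(user, None)

    def route(self, recipient: str, container: bytes) -> str:
        """Tell the sender to go direct, or keep the container for later."""
        if recipient in self.online:
            return "direct:" + self.online[recipient]
        self.stored[recipient].append(container)
        return "stored"


server = CentralServer()
first = server.route("bob", b"hello")       # bob is offline, so it is stored
pending = server.log_on("bob", "10.0.0.2")  # stored container is handed over
second = server.route("bob", b"again")      # bob is online, so send direct
```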
To use the present invention system and method for voice exchange and voice distribution, the originator selects one or more intended recipients from a list of names that have been previously entered into the software agent. The agent permits a number of distinct modes of communication based on the status of the recipient. The status of all recipients entered into the software agent is frequently conveyed to the software agent by the central server. This includes the core states of whether the recipient is online or offline, as well as related status information, for example whether the recipient does not want to be disturbed. For online recipients, the software agent is also notified of the recipient's Internet Protocol (IP) address. Considering just the two core states, the software agent offers the originator alternative ways to communicate with the recipient. This choice can either be dictated by the originator or automatically selected by the software agent, according to rules that are stored. More than two choices are available when all the status information is considered.
If online, the originator can either begin a real-time "intercom" call which simulates a telephone call or a voice instant messaging session, which allows for an interruptible conversation. The choice of these modes depends on the activities of both parties, the intended length of conversation and the quality of the communications path between the two individuals, which is generally not controlled by either party. The previously stored IP address is used to enable direct, peer-to-peer communications.
If off line, the originator can either begin a voice mail conversation, which will be delivered the next time the recipient logs in, or have the message delivered to the recipient's e-Mail as a digitally encoded MIME attachment. Again, the choice of delivery options is based on the interests of both parties and whether the recipient is sufficiently mobile that access to the registered computer is not always available. For these cases, the voice containers are delivered to the central server to manage the ultimate delivery to the recipient.
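The choice among the four delivery modes, considering only the two core states, could be sketched as below. The enum `Mode`, the function `choose_mode` and the two fallback rules (`prefer_real_time`, `recipient_is_mobile`) are hypothetical stand-ins for the stored rules, which the text does not spell out.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    INTERCOM = "real-time intercom"
    VOICE_IM = "voice instant messaging"
    VOICE_MAIL = "voice mail"
    EMAIL_ATTACHMENT = "e-mail MIME attachment"


def choose_mode(
    online: bool,
    originator_choice: Optional[Mode] = None,
    prefer_real_time: bool = False,
    recipient_is_mobile: bool = False,
) -> Mode:
    """Pick a delivery mode from the recipient's core state and stored rules."""
    if online:
        allowed = (Mode.INTERCOM, Mode.VOICE_IM)
    else:
        allowed = (Mode.VOICE_MAIL, Mode.EMAIL_ATTACHMENT)
    # An explicit choice by the originator wins when it fits the state.
    if originator_choice in allowed:
        return originator_choice
    # Otherwise fall back to the (assumed) stored rules.
    if online:
        return Mode.INTERCOM if prefer_real_time else Mode.VOICE_IM
    return Mode.EMAIL_ATTACHMENT if recipient_is_mobile else Mode.VOICE_MAIL


online_default = choose_mode(online=True)
offline_mobile = choose_mode(online=False, recipient_is_mobile=True)
```

Additional status values, such as a do-not-disturb flag, would add further branches; they are omitted here to keep the two-state case clear.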
Once the delivery mode has been selected, the originator digitally records messages for one or more recipients using a microphone-equipped device and the software agent. The software agent compresses the voice and stores the file temporarily on the PC if the voice will be delivered as an entire message. If the real time "intercom" mode has been invoked, a small portion of the digitized voice is stored to account for the requirements of the Internet protocols for retransmission and then transmitted before the entire conversation has been completed. Based on status information received from the central server, the agent then decides whether to transport the voice containers to a central file system and/or send them directly to another software agent using the IP address previously stored in the software agent. If the intended recipient has a compatible active software agent on line after log on, the central server downloads the voice recording almost immediately to the recipient. The voice is uncompressed and the recipient can hear the recording through the speakers or headset attached to their computer. The recipient can reply in a complementary way, allowing for near real-time communications. If the recipient's software agent is not on line, the voice recording is stored in the central server until the recipient's software agent is active. In both cases, the user is automatically notified of available messages once the voice recordings have been downloaded to storage on their computer. The central server coordinates with software agents on all computers continuously, updating addresses, uploading and downloading files and selectively retaining voice recordings in central storage.
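The difference between delivering an entire message and the real time "intercom" mode can be illustrated with a small sketch. The function names and the `frames_per_chunk` parameter are assumptions, and compression is omitted so that only the buffering behavior is shown.

```python
from typing import Iterator, List


def pack_and_send(frames: List[bytes]) -> List[bytes]:
    """Whole-message mode: one container holding the entire recording."""
    return [b"".join(frames)]


def intercom_stream(frames: List[bytes], frames_per_chunk: int = 2) -> Iterator[bytes]:
    """Intercom mode: buffer a small portion, then emit it before recording ends."""
    buffer: List[bytes] = []
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == frames_per_chunk:
            yield b"".join(buffer)
            buffer = []
    if buffer:  # flush whatever is left when the speaker stops
        yield b"".join(buffer)


frames = [b"a", b"b", b"c", b"d", b"e"]
whole = pack_and_send(frames)
chunks = list(intercom_stream(frames, frames_per_chunk=2))
```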
In all cases, the originator can include and reference other Internet and file based information, by including it in the data elements of the format. For example, an e-Mail recipient could choose to reference the e-Mail but respond with voice to one or more addressees of the e-Mail. The forwarded recipients could either receive this in the software agent, which can portray the original e-Mail, or as a standard MIME attachment to e-Mail, or in both ways depending on administrative settings.
Limited by current dial-up bandwidth, voice containers are exchanged to enable users to experience a comfortable, but somewhat delayed, conversation. However, as bandwidth deployment increases via cable modems, high-speed subscriber lines, and other techniques, the conversational gaps are reduced and an even more natural sounding conversation results.
Interworking with other services
Telephone connections, using Touch Tone control codes, can emulate the basic service, allowing users on any telephone connection to send voice recordings to others or to receive their own recordings. Calls may be originated from the central server to one or more telephones, based on rules and preferences provided by the recipient, when a voice container is completed. The central server will transcode the voice component to commonly used network formats. It will then ring (or otherwise alert) the distant telephone and allow the individual who answers to either listen to the voice container or let it remain in storage. Moreover, the answerer can be given the option of speaking a voice destined for the originator, which will again be transcoded and returned to the subject system, for delivery to the originator. Finally, mobile users can call into the central server and request to hear messages pending delivery to their system address. The addresses may be assigned by individuals based on their own choice of "name", allowing anonymous voice communications to occur. Similarly, voice recordings may be exchanged based on personal profiles of people with similar interests.
Interworking with other services the customer utilizes can be provided. This includes converting the present invention system and method for voice exchange and voice distribution voice containers to and from conventional voice mail services; attaching voice containers to e-mail messages; and converting e-mail text to voice for delivery by the present invention.
Example: Group Consultation
An important application of the present invention system and method for voice exchange and voice distribution includes the ability for large numbers of people to voice communicate with one person or others in a large group with high voice fidelity and either local or centralized control. A single person, such as a seller in an on line auction, can communicate with others in a controlled bidding group in natural voice communications. A visual presentation of messages from bidders allows the listener to hear them in any order, to repeat them to others participating in the conversation, or to flow the messages to one or more of the other participants. Either through manual means or through programmed means, this allows a near real-time dynamic exchange of information. Conventional telephonic solutions require both complex hardware and careful sound adjustments. They also offer limited control of who can hear and be heard. By using the address list and the reply function, the sender can select one or multiple recipients for the message.
Example: Multimedia Attachments
Another important application of the present invention system and method for voice exchange and voice distribution is attaching other media to the voice containers to provide a richer communications environment. For example, voice containers may have digitized greeting cards appended to them to present a personalized greeting.
The voice container has the ability to have other data types attached to it and thus be transported to the recipient. In one implementation example, the voice container can be formatted using industry standards such as Multipurpose Internet Mail Extension (MIME) format. This extension allows non-textual messages and multipart message body attachments to be specified in the message headers. MIME was developed and adopted by the Internet Engineering Task Force (IETF). The MIME protocol, which is an extension of SMTP, covers binary, audio and video data.
Extensive technical information on the MIME protocol can be found in the following documents which are incorporated by reference: RFC 1342 MIME (Multipurpose Internet Mail Extensions): Mechanisms for Specifying and Describing the Format of Internet Message Bodies. N. Borenstein, N. Freed. June 1992; RFC 1344 Implications of MIME for Internet Mail Gateways. N. Borenstein. June 1992; RFC 1426 SMTP Service Extension for 8bit-MIMEtransport. J. Klensin, WG Chair, N. Freed, Editor, M. Rose, E. Stefferud & D. Crocker. February 1993; RFC 1428 Transition of Internet Mail from Just-Send-8 to 8bit-SMTP/MIME. G. Vaudreuil. February 1993; RFC 1437 The Extension of MIME Content-Types to a New Medium. N. Borenstein & M. Linimon. 1 April 1993; RFC 1521 MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies. N. Borenstein & N. Freed. September 1993; RFC 1522 MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text. K. Moore. September 1993; RFC 1523 The text/enriched MIME Content-type. N. Borenstein. September 1993; RFC 1524 A User Agent Configuration Mechanism For Multimedia Mail Format Information. N. Borenstein. September 1993; RFC 1556 Handling of Bi-directional Texts in MIME. H. Nussbacher. December 1993; and, RFC 1563 The text/enriched MIME Content-type. N. Borenstein. January 1994.
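As one possible illustration of MIME formatting, the `email.message.EmailMessage` class in the Python standard library can wrap voice data as an audio attachment to a text message. The addresses, subject, file name and the choice of the `audio/wav` type are all assumptions of this sketch; the specification does not prescribe them.

```python
from email.message import EmailMessage


def build_voice_mail(sender: str, recipient: str, voice_data: bytes,
                     filename: str = "message.wav") -> EmailMessage:
    """Wrap voice data as an audio attachment in a multipart MIME message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Voice message"
    msg.set_content("A voice message is attached.")
    # Adding an attachment turns the message into multipart/mixed.
    msg.add_attachment(voice_data, maintype="audio", subtype="wav",
                       filename=filename)
    return msg


# Placeholder bytes stand in for real recorded audio.
message = build_voice_mail("alice@example.com", "bob@example.com", b"RIFF....")
attachments = list(message.iter_attachments())
```

Further attachments, such as a digitized greeting card, could be added with additional `add_attachment` calls using an image type.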
Example: Voice Annotated Web Pages
Another application of the present invention system and method for voice exchange and voice distribution is permitting the recording of one or more voice packet messages on a personal computer, voice container enabled device or by telephone emulation to be heard on a networked computer. Another computer displaying a Web page with an embedded icon for voice container, can request the delivery of an appropriate voice recording, which in turn can be played through the PC speakers. The software agent can then accept voice response from the Web page viewer and allow a live or stored conversation to be exchanged. This process is enabled by the Web page directing an HTTP-encoded message to the Web server containing the UserID of the Web page owner. To ascertain the UserID of the requestor, the Web server enquires of the machine that contains the software agent using existing Web commands. This permits the proxy server to direct the Web owner's voice to the appropriate software agent.
An example is a real estate ad, in which an agent's voice explains important features of a property when the page is loaded on a PC. Additionally, more than one voice packet message can be tailored to the page for a visitor, based on information obtained at log in, through stored information in the personal computer, or based on dynamic information accumulated by the host service provider during a particular user's session. For example, a real estate visitor might view small and large apartment ads. At a particular site, a pre-determined filter could ask for a voice packet message emphasizing the consumer benefits of the larger apartments when they are visited. Alternatively, new messages can be quickly recorded and replace older messages if, for example, there is a price change or demand leaves only a few remaining units.
Example: Non-PC Devices
The voice recordings that are made via a microphone or converted by text-to-speech software can be used for many other purposes. These voice files can be played and recorded using voice container enabled devices. These devices include Personal Digital Assistants (PDAs), microprocessor based appliances, such as set-top boxes, or audio play-back devices, such as MPEG Layer-3 (MP3) audio play-back devices. The result is a connection between the Internet and voice-playback devices useful for many practical applications including but not limited to: talking road maps - an Internet-generated voice road map played through a tape player or MP3 player; talking calendars - an Internet-coordinated voice calendar reminder system; and, talking schedules - an Internet-driven voice scheduler, for wake up calls or TV programs.
Opportunities for Generating Value-Added Services
Today, some Internet services and features are offered at no cost, once a user has reached a Web site or downloaded a plug-in or software component. The principal incentive for this is to obtain revenues from activities related to site-visitor usage, such as: association with paid advertising; commission on "click-through" sales; usage or calendar-based subscription fees for upgraded service; and, the sales of user lists. The basic capabilities of the present invention system and method for voice exchange and voice distribution can be offered to all users without charge, through the downloading of the software agent. The central server will implement a set of controls that manage capabilities associated with revenue generating offers. These capabilities include: how long voice containers are stored in the central server; how quickly voice containers are accepted and delivered; the elapsed time of any given voice container; the number of parties that may simultaneously send and receive voice containers in a session; the degree of interworking between the present invention system and method for voice exchange and voice distribution and other systems; the inclusion of system-generated messages which present audible advertising that generates revenues for the service provider; and, the number and type of attachments that can accompany a voice container message.
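By way of illustration only, the server-side controls listed above could be grouped into a per-subscriber plan record that the central server consults before accepting a voice container. The sketch below is one possible arrangement, not a description of the actual implementation; the class name, field names, function name, and every numeric limit are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServicePlan:
    # All field names and limits are hypothetical, chosen for illustration.
    storage_days: int            # how long containers stay on the central server
    max_container_seconds: int   # elapsed time of any single voice container
    max_session_parties: int     # parties exchanging containers in one session
    max_attachments: int         # attachments allowed per voice container
    audible_ads: bool            # whether system-generated ads are included


def violations(plan: ServicePlan, seconds: int, parties: int,
               attachments: int) -> List[str]:
    """Return the names of the plan limits that a requested send would exceed."""
    exceeded = []
    if seconds > plan.max_container_seconds:
        exceeded.append("max_container_seconds")
    if parties > plan.max_session_parties:
        exceeded.append("max_session_parties")
    if attachments > plan.max_attachments:
        exceeded.append("max_attachments")
    return exceeded


# A hypothetical no-charge plan with tight limits and audible advertising.
FREE = ServicePlan(storage_days=7, max_container_seconds=60,
                   max_session_parties=2, max_attachments=0, audible_ads=True)
```

A request within the limits yields an empty list, while a request exceeding several limits yields the name of each exceeded control, which the server could use to present an upgrade offer.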
Systems Architecture
Voice container structural components
Referring to FIG. 3, there is illustrated an exemplary embodiment of the voice container having voice data and voice data properties components. Voice container components include an originator's code 302 (which is a unique identifier), one or more recipients' codes 304, originating time 306, delivery time(s) 308, number of "plays" 310, voice container source 312, which may be a PC, telephone agent, non-PC based appliance, or other, voice container reuse restrictions 314, which may include one time and destroy 316, no forward 318, password retrieval 320, delivery priority 322, session values 324, session number 326, sequence number for partitioned sequences 328, repeating information 330, no automatic repeat 332, repeat times 334, and a repeat schedule 336. Additionally, the voice container will have information concerning codec type, size, sample rate, and data. The voice container will be sent using standard TCP/IP transport.
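As a non-limiting sketch, the components enumerated for FIG. 3 could be represented in software as a simple record. The Python field names, types, and default values below are assumptions made for illustration; only the reference numerals in the comments come from the description, and the repeating information 330 to 336 is omitted for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ReuseRestrictions:                      # 314
    one_time_and_destroy: bool = False        # 316
    no_forward: bool = False                  # 318
    password_retrieval: bool = False          # 320


@dataclass
class VoiceContainer:
    originator_code: str                      # 302, unique identifier
    recipient_codes: List[str]                # 304
    originating_time: datetime                # 306
    codec_type: str                           # for example "GSM"
    sample_rate_hz: int
    data: bytes                               # encoded voice payload
    delivery_times: List[datetime] = field(default_factory=list)   # 308
    play_count: int = 0                       # 310
    source: str = "PC"                        # 312
    restrictions: ReuseRestrictions = field(default_factory=ReuseRestrictions)
    delivery_priority: int = 0                # 322
    session_number: Optional[int] = None      # 326
    sequence_number: Optional[int] = None     # 328

    @property
    def size(self) -> int:
        """Size of the voice data in bytes."""
        return len(self.data)

    def may_forward(self) -> bool:
        """True unless the originator set the no-forward restriction."""
        return not self.restrictions.no_forward
```

Keeping the reuse restrictions in their own record lets a receiving agent check a single object before playing, forwarding, or destroying a container.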
Servers
The central server consists of many components. These components can reside in a single physical hardware server or across multiple servers. No dependency exists on the operating system, hardware, database mechanism, or transport for the
server and the server components.
The registration server assigns the software agent a unique address. This address is used for all communications from the software agent to the server, its components, and between other software agents. The address assigned will be maintained in a data store. Each software agent may have multiple e-mail addresses, telephone numbers, name aliases, or other identifiers that may be associated with the unique id of the software agent.
The authentication server will permit or deny access to software agents based on the unique id of the software agent and a user name and password. The protocol between the software agent and the authentication server will be sent through a proxy server. This helps ensure a high degree of security.
Authentication is currently assumed to be through an Open Database Connectivity (ODBC) mechanism to a Structured Query Language (SQL) server. Future solutions may use Lightweight Directory Access Protocol (LDAP) Version 3.0. Authentication will be done in the authentication server.
The proxy server permits software agents to access backend servers and to retrieve and store voice files in the backend servers. The proxy server feature provides the capability to authenticate users through authentication servers and then switch incoming requests from the authenticated users to backend servers through a Firewall.
Software agents will gain access to the system through the log on process which interfaces with the notification server. Once authenticated, software agents will have access to the features and functions of the rest of the system. When a software agent has been authenticated all other software agents that are in the specific group or community of the authenticated software agent will be notified that the other agent(s) are on line. Should a software agent log off the system then a notification of such will be sent to all interested software agents. The software agent will notify the server with the Internet address that they are currently using for the session to identify where the messages should be sent.
The message server will be the repository for messages sent to software agents that are not logged onto the system. Once a software agent has been authenticated, all messages that have been stored on the message server will be sent to the appropriate software agent. If a software agent is on-line, i.e. has been authenticated with the system and has notified other software agents via the notification server that it is on-line, the messages will be sent from each software agent to the other software agents that have their status set to available for receipt of messages. Another feature of the message server is the ability for the messages to be played or retrieved from other devices such as a telephone, PDA, the Web or sound enabled devices.
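The store-or-deliver behavior described above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the class and method names are hypothetical, the `delivered` list stands in for an actual network send (which, for on-line agents, the description performs directly between agents), and agent states other than logged on or logged off are ignored.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple


class MessageServer:
    def __init__(self) -> None:
        self.online: Dict[str, str] = {}                     # agent id -> Internet address
        self.stored: DefaultDict[str, List[bytes]] = defaultdict(list)
        self.delivered: List[Tuple[str, bytes]] = []         # stand-in for network sends

    def send(self, recipient_id: str, container: bytes) -> str:
        """Deliver to an on-line recipient, otherwise hold the container."""
        address = self.online.get(recipient_id)
        if address is None:
            self.stored[recipient_id].append(container)
            return "stored"
        self.delivered.append((address, container))
        return "delivered"

    def log_on(self, agent_id: str, address: str) -> int:
        """Record the session address and flush any waiting containers."""
        self.online[agent_id] = address
        pending = self.stored.pop(agent_id, [])
        for container in pending:
            self.delivered.append((address, container))
        return len(pending)

    def log_off(self, agent_id: str) -> None:
        self.online.pop(agent_id, None)
```

In this sketch a container addressed to an agent that is not logged on waits in `stored` and is flushed in arrival order at the next `log_on`.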
The voice containers will contain messages that have been recorded by a codec. In one embodiment, GSM is used as the default codec for the system. Other codecs, such as G.723 and G.729, are also supported. Other codecs may be used, as they may be dependent upon the platform on which the agent is running.
The Operations, Administration and Maintenance (OA&M) server will communicate with the software agent, telephone devices, and other Internet based agents to manage many functions and services. These are detailed below.
The software agent will, on log-in and authentication with the system, negotiate the version number of the agent with the server. If the version of the software agent is older than the version that the server sees as the most current version, then the newest version will be downloaded to the software agent and dynamically replace the older version. This will be done as a background process or as a response to the user permitting the download to occur.
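The version comparison at log-in might be sketched as below. The dotted numeric version format and both function names are assumptions; the description does not specify how version numbers are encoded. The sketch also assumes both versions have the same number of components.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Split a dotted version string such as '1.10' into integer components."""
    return tuple(int(part) for part in text.split("."))


def needs_update(agent_version: str, server_latest: str) -> bool:
    """True when the agent is older than the version the server considers current."""
    return parse_version(agent_version) < parse_version(server_latest)
```

Comparing integer tuples rather than raw strings avoids treating version 1.9 as newer than version 1.10.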
The server and the software agent communicate over a set of well-known ports. These ports are known to the server and the software agent. It may become necessary for security, load balancing, firewalls or other purposes to change the port numbers. Port numbers will be able to be changed dynamically between the software agent and the server.
The OA&M protocol supports the capability of a guest log-in. At the time of a guest log-in the server will download messages for viewing at the software agent. Special processing will occur on the software agent. This will be detailed in a later section.
The server will maintain a unique set of lists for each software agent. These lists will contain the identifiers of the other software agents that are permitted to send and receive voice containers and other media types. The server will maintain the current list of agents and be able to create, delete, and modify those lists based on software agent requests or web based administration. A software agent will also have the ability to block or filter unwanted messages by sending a command to the server.
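One way to picture the per-agent lists and the block command is the following sketch. The class and method names are hypothetical, and the rule that a block overrides a permission is an assumption; the description does not state how the two interact.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class ContactLists:
    def __init__(self) -> None:
        # Each owner agent id maps to the set of agent ids on its lists.
        self._permitted: DefaultDict[str, Set[str]] = defaultdict(set)
        self._blocked: DefaultDict[str, Set[str]] = defaultdict(set)

    def permit(self, owner: str, other: str) -> None:
        self._permitted[owner].add(other)

    def remove(self, owner: str, other: str) -> None:
        self._permitted[owner].discard(other)

    def block(self, owner: str, other: str) -> None:
        self._blocked[owner].add(other)

    def may_deliver(self, sender: str, recipient: str) -> bool:
        """Assumed rule: the sender must be permitted and must not be blocked."""
        return (sender in self._permitted[recipient]
                and sender not in self._blocked[recipient])
```

Under this assumed rule, an unknown sender is refused by default and a blocked sender is refused even if it remains on the permitted list.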
The software agent and the server will be able to set and manage the bandwidth and the number of sessions that they can manage. This will be based on the connection that is available to the device, the transport being used, and the size and basis of the messages that are being sent.
The software agent and the server will be able to set, on a system wide or message by message basis, the various privacy features of the messages. This will include the forwarding of messages from one agent to another and the denial of forwarding of the messages.
Each software agent that has been loaded and registered with the system will, in addition to the standard codec used for the encoding and decoding of the voice containers, detail the other codecs that the software agent may have access to on the system. The information about the codecs will be retained in the server. The information that will be retained includes the codec name, associated numbers, format, sampling rate, and number of bits.
Other types of voice containers may be delivered from the server to the software agent. These include advertisements, administrative messages from the present invention system and method for voice exchange and voice distribution, faxes, images, and interconnection with other real time services such as H.323, SIP, and other services.
The server will also maintain the current address and/or priority of the delivery of the messages for the software agent and other devices that the message should be delivered to for the end user.
The message server will download all messages to the software agent and/or retain copies of the messages based on administrative settings from the user.
Capabilities Exchange
When the software agent registers with the server it will send information on the capabilities of the hardware and software of the system onto which it has been installed. The information will include all of the codecs, real time services, standards based services, and products that the software agent can interface and operate with on the device.
Transcoding Server
The server will have the ability to transcode the voice container that has been recorded with the default codec. Other codec formats may be supported on other devices. This will enable the other device to have the ability to play or record messages to the registered user population in formats that they can decode. An example of this would be the ability to send a voice container that has been sent from the software agent to the server and then sent as an e-mail attachment to a larger end user population.
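The decision of whether to transcode could, for example, draw on the codec list that each agent reports during capabilities exchange. The sketch below shows only that selection step, not the audio conversion itself. The function name and the rule of falling back to the first codec the recipient reported are assumptions.

```python
from typing import Optional, Sequence


def choose_target_codec(source_codec: str,
                        recipient_codecs: Sequence[str]) -> Optional[str]:
    """Pick the codec in which a container should be delivered.

    Returns the source codec when the recipient can decode it (no transcoding
    needed), otherwise the first codec the recipient reported, or None when
    the recipient reported no codecs at all.
    """
    if source_codec in recipient_codecs:
        return source_codec
    if recipient_codecs:
        return recipient_codecs[0]
    return None
```

When the returned codec differs from the source codec, the server would transcode the container before delivery; when it is None, the container cannot be delivered in a playable format.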
Software Agent
The software agent needs to be able to operate under multiple operating system platforms. A user or company may want to block access to services provided by the present invention. If the service is blocked by a Firewall, the security policy in place will be honored. The design should take into account potential software agents behind other firewalls and enable the software agent to communicate with the server when a Firewall is in place, if permitted by the security policies of the Firewall.
Where the Firewall is administered to limit ports accessible to an external server, the software agent can be changed to use other available ports, most notably the ports used for generic request-response traffic for the World Wide Web.
The software agent-server-proxy protocol set of functions is a byte-oriented, acknowledged protocol.
The transport mechanism for all communications between all software agents and the server will be over TCP/IP, User Datagram Protocol (UDP), and the PSTN. This will be dependent on the devices supported.
Software agents, proxies, and servers need to have the ability to have their ports administered.
Each software agent on the system will have a 32-character agentID. The emailID of the user can be the same as this agentID. If the emailID is not 32 characters, the software agent should appropriately expand it to a 32-character ID. During the registration process all of the known and supported codecs, as well as the default codec, will be interrogated and reported to the server. This will be retained for coding optimization, transcoding, system performance, and quality going forward. Data reported from the software agent and the host system include capabilities such as sound enabled, microphone enabled, and other compatible software programs and utilities that may exist.
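The description does not say how a shorter emailID is expanded to 32 characters. Purely as an assumed example, the sketch below right-pads the identifier with a fill character; the function name, the choice of padding character, and the decision to reject identifiers longer than 32 characters are all hypothetical.

```python
AGENT_ID_LENGTH = 32


def to_agent_id(email_id: str, pad: str = "_") -> str:
    """Expand an emailID to a fixed-width agentID by right-padding (assumed scheme)."""
    if len(email_id) > AGENT_ID_LENGTH:
        # Handling of over-long identifiers is not described; this sketch rejects them.
        raise ValueError("emailID is longer than 32 characters")
    return email_id.ljust(AGENT_ID_LENGTH, pad)
```

A padding scheme of this kind is reversible only if the fill character never appears at the end of a real emailID, which is one reason an actual system might choose a different expansion.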
The software agent will start a log on process with the authentication server through the proxy server every time the system is started, re-started, or if the user logs off from the system. The log-on process will consist of the user identifier and password that was established during the initial registration process. If the user is at the device or machine where the software agent id was originated the user will be able to perform all of the functions that the system provides. A successful log-in will result in all of the user messages waiting in the message server being downloaded to the software agent. The user may have elected to retain copies on the message server. If this is the case the messages will be retained at the server until they have been aged off by administrative settings.
When a software agent is located behind a Firewall, the agent can be administered to repeat the login process periodically over Firewall ports that are normally open, such as the port for the World Wide Web. In this case, the request from the software agent will be received by the proxy server, which will return all notification information (such as who else is on line in the user's "buddy list") and download all available messages to the agent. In exchange, all agent-stored messages will be delivered to the server. The timing for this process can be dynamically adjusted so that the user perceives little or no delay in the exchange of voice containers with other online agents.
The user may also be at another machine where they can perform a guest log-in. By selecting guest log-on from the software agent and then inputting their user id and password, the user causes the message server to download the messages to the software agent. When the user logs off, when another log-on occurs, or after a specified time period, the messages for the guest account will be deleted from the local device and software agent where they were downloaded.
Users will also be able to log onto the system by dialing into a telephone number and, by using voice recognition, touch-tone entry or other means, be able to retrieve their voice containers.
Authentication server refers to a software agent connecting through a proxy to the server used for verifying the userID and password of a user trying to log onto the system. The software agent will authenticate through the proxy server using a well defined secure protocol. The software agent will send a copy of the currently logged on Internet address to the notification server for purposes of notifying other software agents of its status and receiving messages.
The control messages from a client machine to the proxy server (which are forwarded by the proxy server to the backend server) are encrypted. Similarly, the messages from the proxy server to the authentication server/backend servers are also encrypted.
The proxy server may listen on several ports for connection requests from clients. Against each port the proxy server listens on, there will be only one authentication server to which the proxy sends the client information for
authentication. The authentication server should keep track of which user in its database belongs to which backend server. If a user is moved from one backend server to another backend server, the backend server should update the authentication server with the new information.
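The two bookkeeping rules in this paragraph, one authentication server per proxy port and a user-to-backend record kept by the authentication server, might be sketched as follows. All class names, method names, and server names are hypothetical, and the in-memory dictionaries stand in for whatever data store an embodiment would use.

```python
from typing import Dict


class ProxyPortMap:
    """Maps each listening port of the proxy to exactly one authentication server."""

    def __init__(self) -> None:
        self._auth_by_port: Dict[int, str] = {}

    def bind(self, port: int, auth_server: str) -> None:
        current = self._auth_by_port.get(port)
        if current is not None and current != auth_server:
            raise ValueError(f"port {port} is already bound to {current}")
        self._auth_by_port[port] = auth_server

    def auth_server_for(self, port: int) -> str:
        return self._auth_by_port[port]


class AuthDirectory:
    """Tracks which backend server holds each user's data."""

    def __init__(self) -> None:
        self._backend_by_user: Dict[str, str] = {}

    def assign(self, user_id: str, backend: str) -> None:
        # Called at registration and again when a backend reports that a user moved.
        self._backend_by_user[user_id] = backend

    def backend_for(self, user_id: str) -> str:
        return self._backend_by_user[user_id]
```

Rejecting a second, different binding for a port enforces the one-server-per-port rule at configuration time rather than at log-in time.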
Multiple users cannot log in from the same machine with the same authentication password. The software agent, however, can be designed to allow different parties to change the UserID and password and reuse the same machine, at different times.
After the software agent has logged onto the system and has been authenticated, it will have access to the system. During the authentication process the Internet address of the newly authenticated software agent will be made known to all other interested software agents and retained in the proxy server. The notification process will also query the server to find out the other registered software agents that are currently logged onto the system and send the Internet address of the other logged on software agents to the authenticated, newly logged on software agent.
When a software agent logs off the system all remaining interested, logged on software agents will be notified that the software agent is no longer available.
Software agents may also be in other states that will be communicated to the other logged on software agents. These states would be the following: Available - available for messages or live talking; Do Not Disturb - available for messages but not live talking; Not Available - system is logged on but not accepting messages or live talk; Will return - Stepped out of the office and is accepting messages; Out to Lunch - Stepped out to lunch and is accepting messages; Not logged on - Message will be sent to the message server; Forward to Telephone; Forward to PDA; and, Other states - defined as needed.
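The states above lend themselves to a small enumeration with routing rules. The sketch below is illustrative only: the names are hypothetical, and the outcomes labelled "refused" and "forward" are assumptions, since the description states only that a Not Available agent is not accepting messages and does not define the handling of the forwarding states.

```python
from enum import Enum


class AgentState(Enum):
    AVAILABLE = "Available"
    DO_NOT_DISTURB = "Do Not Disturb"
    NOT_AVAILABLE = "Not Available"
    WILL_RETURN = "Will return"
    OUT_TO_LUNCH = "Out to Lunch"
    NOT_LOGGED_ON = "Not logged on"
    FORWARD_TO_TELEPHONE = "Forward to Telephone"
    FORWARD_TO_PDA = "Forward to PDA"


# States in which a logged-on agent accepts delayed messages directly.
_ACCEPTS_MESSAGES = frozenset({
    AgentState.AVAILABLE,
    AgentState.DO_NOT_DISTURB,
    AgentState.WILL_RETURN,
    AgentState.OUT_TO_LUNCH,
})


def accepts_live_talk(state: AgentState) -> bool:
    """Only the Available state is described as open to live talking."""
    return state is AgentState.AVAILABLE


def delayed_message_route(state: AgentState) -> str:
    """Return 'peer', 'message_server', 'forward', or 'refused' (labels assumed)."""
    if state is AgentState.NOT_LOGGED_ON:
        return "message_server"
    if state in (AgentState.FORWARD_TO_TELEPHONE, AgentState.FORWARD_TO_PDA):
        return "forward"
    return "peer" if state in _ACCEPTS_MESSAGES else "refused"
```

A sending agent could consult these functions with the recipient's last notified state before choosing between live talk, direct delivery, and the message server.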
Messages will be created on the software agent using the default codec or another codec available on the system. This codec can be automatically selected based on service parameters. For example, less efficient codecs may be selected where they are known to have more universal support in some applications. Messages will be stored on the server until the software agent has gone on line and authenticated. Once authenticated all of the messages for that software agent will be sent to the software agent. A copy of the messages may be retained at the message server. Other software agents will be able to send messages directly to the software agent that has authenticated. This will be done on a peer-to-peer software agent basis. Multiple software agents may be sent a copy of the message. Software agents that are not logged onto the system will receive a copy in the message server.
Two types of messages will exist for the software agent: a delayed message that can be sent to or received from other software agents, and a live talk mode. The live talk mode will send the voice container to another software agent to be decoded in real time.
Both the live talk and messaging capabilities will be influenced by the state of the software agent, as described for the notification process above.
Messages will also be able to be retrieved via the Web, telephone, set top box, and other devices. The message server can also send the voice container as an e-mail attachment to a user, whether or not that user is using a registered software agent. The recipient can be specified from within the software agent and can be part of the software agent's list without being registered as a user on the system.
Each message will have a unique identifier that will encode the sending software agent's identifier, the destination software agents and non-registered users, the codec used, date and time of the message, the forwarding rules and permissions, body of the message, and whether the message was received, played, or deleted without listening. Since a message may go from one peer to another without the messaging server being involved, a message will be sent to the server with all of the pertinent information about the message but not the body. This information will be used for monitoring of the service, guaranteeing service levels, and verifying end user software agent functions.
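The report sent to the server after a peer-to-peer exchange can be pictured as the message record with its body removed. The record layout, field names, and function name below are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class MessageRecord:
    message_id: str
    sender_id: str
    recipient_ids: List[str]
    codec: str
    sent_at: str          # date and time of the message, format assumed
    may_forward: bool     # forwarding rules and permissions, simplified
    status: str           # for example "received", "played", or "deleted"
    body: bytes           # encoded voice data


def server_report(record: MessageRecord) -> Dict[str, Any]:
    """Return every field of the record except the message body."""
    report = asdict(record)
    del report["body"]
    return report
```

Because the report omits the body, it stays small and reveals nothing of the recorded voice while still supporting service monitoring.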
Referring to FIG. 4 there is a high level flow chart for PC to PC and PC to network communications utilizing the system for voice exchange and voice distribution. In FIG. 5 there is a high level flow chart for dial in emulation from a telephone utilizing the system for voice exchange and voice distribution. There can be seen in FIG. 6 a high level flow chart for spot calling utilizing the system for voice exchange and voice distribution. FIG. 7 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the originator. FIG. 8 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the central server. FIG. 9 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the recipient.
FIG. 10 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the originator of a voice spot. Referring to FIG. 11 there is shown a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the central server for a voice spot. FIG. 12 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the recipient of a voice spot. Referring to FIG. 13 there is shown a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the originator and recipient for an anonymous voice communication. FIG. 14 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the central server for an anonymous voice communication. FIG. 15 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the central server for emulation through a telephone system. Referring to FIG. 16 there can be seen a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the originator of a voice container with multimedia attachments. FIG. 17 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the central server for a voice container with multimedia attachments. FIG. 18 shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to the recipient of a voice container with multimedia attachments. FIG. 19
shows a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to preparing a voice container without a PC. Referring to FIG. 20 there is shown a flow chart of an exemplary embodiment of the method and system for voice exchange and voice distribution with respect to playing a voice container on a non-PC based appliance.
Numerous modifications and alternative embodiments of the invention will be apparent to those skilled in the art in view of the foregoing description. Accordingly, this description is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the best mode of carrying out the invention. Details of the structure may be varied substantially without departing from the spirit of the invention and the exclusive use of all modifications which come within the scope of the appended claim is reserved.