Using Voice and Speech in Order Fulfillment

Voice-directed picking is receiving a lot of attention these days. This paper attempts to present a fair comparison of the benefits and shortcomings of VDP (Voice Directed Picking), RFP (RF Directed Picking), and PTL (Pick to Light). All three technologies allow real-time pick decisions. Paper-based picking is not included in the comparison because it does not allow significant real-time optimization of the picking process.

Separation of a “Human (Worker) Interface” and Functionality

Comparisons of voice-directed picking with other pick technologies often blur the distinction between the human interface and system functionality. To make a fair comparison, functionality must be separated from the human interface. With any of the compared technologies, it is possible to dynamically dispatch work and retrieve completion information from the worker. A particular vendor, however, may not actually provide dynamic work optimization, and attributing that gap to the technology itself leads to a false conclusion. This paper focuses only on the human interface.

Human Interface Overview

VDP: A portable terminal that primarily uses voice (aural) commands to direct the worker and primarily relies on speech recognition to obtain completion information from the worker

RFP: A portable terminal that primarily uses visual commands to direct the worker and primarily uses a scanner to obtain completion information from the worker

PTL: Fixed hardware that uses visual commands to direct the worker and buttons, pressed by the worker, to obtain completion information

Although the primary interfaces are as described above, each of the technologies may use other means to communicate with the worker. For example, some VDP terminals may be equipped with a bar code reader, a PTL or RFP system may use aural communication through the use of a sound transducer, and a PTL system may also use RF terminals. For the purpose of the comparison, only the primary interface is considered for the given technology.

Worker (Human) Characteristics and Limitations

In conjunction with the picking device, the resources or characteristics of workers must also be considered. A crucial resource of a picker (or selector) is the worker’s hands. Pick-to-light systems and RF systems rely on the worker’s eyes. Voice systems rely on the worker’s ears and memory. Visual information persists, so the worker can look at it again and select whatever is relevant when it is needed. Aural information is “lost” if the worker does not capture it and must be re-requested. Another human characteristic is that an individual’s speech is subject to change. These factors need to be considered when evaluating a human interface.

The following table compares the three systems based on several features of high relevance in fulfillment systems (1 is not so good, 2 is good, 3 is excellent):

Feature                                       VDP  RFP  PTL
Pick Productivity in High-Density Areas       2    2    3
Pick Productivity in Low-Density Areas        2    3    1
Freedom to use both hands                     3    1    3
Ability to get directions                     2    2    3
Providing completion information              2    3    3
Initial system “training”                     1    3    3
Simultaneous work at same location            3    3    1
Dependence on selector’s memory               1    3    3
Far distance identification of next location  2    3    1
Last-steps identification of next location    1    2    3
Reduction of pick errors                      2    3    2
Correction of errors                          2    3    1
Battery replacement                           1    1    3
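
One way to apply the ratings is to weight each feature by how much it matters at a particular site and total the results. The sketch below does this in Python. The ratings are copied from the table, while the function name weighted_scores and any weights passed to it are illustrative assumptions, not part of the comparison itself. Note that a simple sum treats the ratings 1, 2, and 3 as evenly spaced, which the table does not claim.

```python
# Ratings copied from the table: (VDP, RFP, PTL), 1 = not so good, 3 = excellent.
SYSTEMS = ("VDP", "RFP", "PTL")
RATINGS = {
    "Pick productivity in high-density areas": (2, 2, 3),
    "Pick productivity in low-density areas": (2, 3, 1),
    "Freedom to use both hands": (3, 1, 3),
    "Ability to get directions": (2, 2, 3),
    "Providing completion information": (2, 3, 3),
    "Initial system training": (1, 3, 3),
    "Simultaneous work at same location": (3, 3, 1),
    "Dependence on selector's memory": (1, 3, 3),
    "Far distance identification of next location": (2, 3, 1),
    "Last-steps identification of next location": (1, 2, 3),
    "Reduction of pick errors": (2, 3, 2),
    "Correction of errors": (2, 3, 1),
    "Battery replacement": (1, 1, 3),
}


def weighted_scores(weights=None):
    """Total each system's ratings, scaling every feature by its weight.

    `weights` maps feature name -> importance; features left out count as 1.0.
    The weights are a site-specific judgment call (hypothetical here).
    """
    weights = weights or {}
    unknown = set(weights) - set(RATINGS)
    if unknown:
        raise KeyError(f"unknown features: {sorted(unknown)}")
    totals = {system: 0.0 for system in SYSTEMS}
    for feature, scores in RATINGS.items():
        weight = weights.get(feature, 1.0)
        for system, score in zip(SYSTEMS, scores):
            totals[system] += weight * score
    return totals


if __name__ == "__main__":
    # Hypothetical site where free hands matter three times as much
    # as any other feature.
    print(weighted_scores({"Freedom to use both hands": 3.0}))
```

Changing the weights changes the ranking. The table rates individual features, and which features dominate depends on the operation.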

None of the compared systems directly reduces walking. The factors that affect productivity relate to the picking tasks performed once the worker is at the next pick location. In identifying pick locations, however, PTL is the best choice in high-density areas, while RFP is the best choice for long travel distances. Walk reduction strategies (e.g., batching or clustering) are not within the scope of this paper.

Some “trained” voice systems have problems dealing with workers who are not speaking normally (e.g., workers with colds). These systems often require the worker’s terminal to be re-trained.

In pick-to-light systems, selectors validate the pick by pushing confirm buttons. In voice systems, selectors read back check strings. Long check strings negatively impact productivity, while short check strings could create accuracy problems.
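
The accuracy side of this trade-off can be made concrete with a little arithmetic. The sketch below assumes that check strings are made of decimal digits assigned to locations independently and uniformly at random; that is an assumption for illustration, not something this comparison specifies. Under that assumption, a selector standing at the wrong location reads back a matching string with probability 1 in 10^n for an n-digit string. The function name false_confirm_probability is likewise hypothetical.

```python
def false_confirm_probability(digits: int) -> float:
    """Chance that a wrong location carries the same check string as the right one.

    Assumes independent, uniformly random decimal digits (illustrative only).
    """
    if digits < 1:
        raise ValueError("a check string needs at least one digit")
    return 1 / 10 ** digits


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"{n} digit(s): {false_confirm_probability(n):.4%}")
```

Each added digit cuts this chance by a factor of ten, but it is also one more digit for the selector to speak on every pick.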

It is becoming common practice for voice systems dealing with long check strings and/or noisy backgrounds to provide workers with handheld scanners for product validation. Such a system is a hybrid of RFP and VDP and should be named “Voice Assisted Picking”. Regrettably, the primary VDP benefit of total availability of the worker’s hands is negated in these systems.

Poorly lit pick areas can be problematic for RFP systems, while noisy environments are normally not suitable for VDP systems.

The price of a pick-to-light system increases with the number of pick locations and is independent of the number of workers. The price of VDP and RFP systems, on the other hand, increases with the number of workers and is independent of the number of locations.
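
The two cost structures can be compared with a small break-even calculation. The sketch below assumes that price grows linearly (the text says only that it increases) and ignores fixed costs such as software and integration. Every price in it is a hypothetical placeholder, not a vendor figure, and the function names are illustrative.

```python
def ptl_cost(locations: int, cost_per_location: float) -> float:
    """PTL hardware cost: scales with pick locations, not with workers."""
    return locations * cost_per_location


def terminal_cost(workers: int, cost_per_terminal: float) -> float:
    """VDP or RFP hardware cost: scales with workers, not with locations."""
    return workers * cost_per_terminal


def break_even_locations(workers: int, cost_per_terminal: float,
                         cost_per_location: float) -> float:
    """Location count at which PTL costs the same as portable terminals."""
    if cost_per_location <= 0:
        raise ValueError("cost_per_location must be positive")
    return terminal_cost(workers, cost_per_terminal) / cost_per_location


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    n = break_even_locations(workers=20, cost_per_terminal=3000.0,
                             cost_per_location=150.0)
    print(f"PTL hardware is cheaper below {n:.0f} locations")
```

Under these assumptions, a site with few pick locations and many workers favors PTL on hardware price, while a site with many locations and few workers favors portable terminals.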

A Final Note on Dynamic Optimization

Each of the considered technologies has the inherent capability of allowing real-time decisions to direct the workflow. It is only through dynamic optimization that any of the systems can reach its full potential. Although not part of the consideration for selecting a human interface for the picking operation, dynamic optimization is the icing on the cake of the picking process.