
Add a realtime ASR demo (both server and client) for DS2 users to try with own voice. #186

Merged
merged 7 commits into PaddlePaddle:develop
Aug 7, 2017

Conversation

xinghai-sun
Contributor

No description provided.

@xinghai-sun xinghai-sun requested review from pkuyym and luotao1 August 3, 2017 04:40
@xinghai-sun xinghai-sun mentioned this pull request Aug 3, 2017

### Playing with the ASR Demo

A real-time ASR demo (`demo_server.py` and `demo_client.py`) is prepared for users to try out the ASR model with their own voice. After a model and language model is prepared, we can first start the demo server:
Contributor

is prepared -> are prepared

Contributor Author

Done.

@@ -83,6 +83,23 @@ def __init__(self,
self._rng = random.Random(random_seed)
self._epoch = 0

def process_utterance(self, filename, transcript):
"""Load, augment, featurize and normalize for speech data.
Contributor

"""换一行再写注释,下同

Contributor Author

The practice of

def function():
"""This is function doc.
"""

follows the Google Python Style Guide, and we keep this style consistent throughout the whole DS2 project.
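As a concrete sketch of that convention (the summary line starts right after the opening quotes; the names and body here are illustrative, not the PR's actual code):

```python
def process_utterance(filename, transcript):
    """Load, augment, featurize and normalize for speech data.

    :param filename: Path to the audio file.
    :param transcript: Transcription text of the utterance.
    """
    # Placeholder body for illustration only.
    return filename, transcript
```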

@@ -0,0 +1,94 @@
"""Client-end for the ASR demo."""
Contributor

Please use `#` for the comment at the top of the file; you can refer to the comment style in layers.py. Same below.

Contributor Author

In layers.py, `#` is only used for the copyright declaration. For a file's docstring, `"""` is used instead.

Contributor

@pkuyym pkuyym left a comment

Great!

@@ -16,6 +16,19 @@ export LD_LIBRARY_PATH=$PADDLE_INSTALL_DIR/Paddle/third_party/install/warpctc/li

Please replace `$PADDLE_INSTALL_DIR` with your own paddle installation directory.

### Setup for Demo

Please do the following extra installation before run `demo_client.py` to try the realtime ASR demo. However there is no need to install them for the computer running the demo's server-end (`demo_server.py`). For details of running the ASR demo, please refer to the [section](#playing-with-the-asr-demo).
Contributor

run --> running
I think it's better to make the ASR demo a single section, including both setup and running instructions.

Contributor Author

Moved this to the demo section.

@@ -35,6 +35,7 @@ def __init__(self, vocab_size, num_conv_layers, num_rnn_layers,
rnn_layer_size)
self._create_parameters(pretrained_model_path)
self._inferer = None
self._loss_inferer = None
Contributor

Would `self._cost_inferer` be better?

Contributor Author

There is no big difference between loss and cost. I prefer loss since it is more commonly used.

@@ -118,6 +119,24 @@ def event_handler(event):
num_passes=num_passes,
feeding=feeding_dict)

def infer_loss_batch(self, infer_data):
Contributor

Would `infer_batch_cost` be better?

Contributor Author

There is no big difference between loss and cost. I prefer loss since it is more commonly used.

"""Model inference. Infer the ctc loss for a batch of speech
utterances.

:param infer_data: List of utterances to infer, with each utterance a
Contributor

with each utterance --> each utterance consists of

Contributor Author

--> with each utterance consisting of ....


If you would like to start the server and the client on two different machines, please use `--host_ip` and `--host_port` to specify the actual IP address and port, for both `demo_server.py` and `demo_client.py`.

Notice that `demo_client.py` should be started on your local computer with microphone hardware, while `demo_server.py` can be started on any remote server or on the same local computer. The IP address and port should be properly set for server-client communication.
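Since the client and server may run on different machines, a quick reachability check before starting `demo_client.py` can save debugging time. A minimal sketch (the helper name is ours, not part of the PR):

```python
import socket

def server_reachable(host_ip, host_port, timeout=3.0):
    """Return True if a TCP connection to (host_ip, host_port) can be opened."""
    try:
        with socket.create_connection((host_ip, host_port), timeout=timeout):
            return True
    except OSError:
        return False
```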
Contributor

Maybe we should point out that network access to the remote server needs to be ensured.

Contributor Author

Done.

@@ -83,6 +83,23 @@ def __init__(self,
self._rng = random.Random(random_seed)
self._epoch = 0

def process_utterance(self, filename, transcript):
Contributor

transcript -> transcription, they have different meanings.

Contributor Author

Both "transcription" and "transcript" can refer to the text content of speech. I use "transcription" in the docs and "transcript" for code variables (since it is shorter).

elif len(data_list) > 0:
# Connect to server and send data
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((args.host_ip, args.host_port))
Contributor

I think it's better to connect once up front and reuse the connection, rather than connecting every time a message is sent.

Contributor Author

Opening a connection only costs a few milliseconds. Besides, opening an independent connection for each utterance simplifies the code (otherwise, on the server side, we would have to handle multiple utterances with a while-loop in a single handle() call).
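The connection-per-utterance design described here can be sketched as follows: the server reads one utterance until the client closes its write side, replies, and the handler returns. All names and the echo-style reply are illustrative, not the PR's actual code:

```python
import socket
import socketserver
import threading

class UtteranceHandler(socketserver.BaseRequestHandler):
    """Serves exactly one utterance per connection: read until EOF, then reply."""

    def handle(self):
        chunks = []
        while True:
            chunk = self.request.recv(1024)
            if not chunk:  # client closed its write side: utterance is complete
                break
            chunks.append(chunk)
        payload = b"".join(chunks)
        # A real server would run speech recognition here; we echo the byte count.
        self.request.sendall(str(len(payload)).encode())

def send_utterance(payload):
    """Open a fresh connection, send one utterance, and return the reply."""
    server = socketserver.TCPServer(("127.0.0.1", 0), UtteranceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.connect(server.server_address)
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)  # signal end-of-utterance to the server
        reply = sock.recv(1024)
        sock.close()
        return reply
    finally:
        server.shutdown()
        server.server_close()
```

Because each connection carries exactly one utterance, the end of the stream itself delimits the message, so no length header or framing protocol is needed.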

type=float,
help="The cutoff probability of pruning"
"in beam search. (default: %(default)f)")
args = parser.parse_args()
Contributor

There are too many arguments duplicated from infer.py and evaluate.py; maybe we can refactor this part later. Marking it here.

Contributor Author

Yes. Let's discuss it later.


def handle(self):
# receive data through TCP socket
chunk = self.request.recv(1024)
Contributor

I think it's better to make 1024 an optional argument.

Contributor Author

No need for that. The chunk size does not matter.

Contributor

@pkuyym pkuyym left a comment

LGTM

@xinghai-sun xinghai-sun merged commit 1da8f7a into PaddlePaddle:develop Aug 7, 2017