Download user manual - MikroElektronika
user manual

The SpeakUp is a speech recognition click™ board. You can set it up to recognize over 200 different voice commands and have the on-board MCU carry them out instantly.

TO OUR VALUED CUSTOMERS

I want to express my thanks to you for being interested in our products and for having confidence in MikroElektronika. The primary aim of our company is to design and produce high quality electronic products and to constantly improve the performance thereof in order to better suit your needs.

Nebojsa Matic
General Manager

The STM32® and Windows® logos and product names are trademarks of STMicroelectronics® and Microsoft® in the U.S.A. and other countries.

Table of Contents

1. Introduction
2. Applications
3. Package Contains
4. How To Use It?
5. Tech Specs
6. Schematics
7. How It Works?
SpeakUp Firmware Algorithm
8. Configuration Through Software
8.1. Typical Workflow
8.2. Getting Started
8.3. Creating A New Project
8.4. New Voice Command
8.5. Configuring Project Settings
8.6. Assigning An Action
8.7. Uploading Project
8.8. Exporting Constants
9. Direct Configuration
10. Recording Tips
11. Examples

1. Introduction

The SpeakUp is a speaker dependent speech recognition click board with standalone capabilities. You can set it up to recognize over 200 voice commands and have the on-board STM32F415RG MCU carry them out. It works by matching sounds with pre-recorded commands. Sound is received through an on-board microphone and then processed by a VS1053 IC with a built-in stereo audio codec. The SpeakUp comes with a dedicated software tool for easy configuration. The board is lined with 12 user programmable GPIOs for standalone functionality. It also carries a standard mikroBUS™ host socket.

Easy configuration
Over 200 commands
Ultra fast operation
Standalone mode

2.
Applications

Wouldn't you rather issue verbal commands and have your machines comply, instead of pressing keys, pushing buttons and flipping switches all the time? There's a wide range of applications for the SpeakUp.

Command your lights, doors and home appliances.
Create voice commanded remotes for TVs or media centers.
Reduce complexity and cost of control interfaces.
Keep control when both hands are busy and voice command is the only option.

3. Package Contains

Box
Package dimensions: L 70mm, W 60mm, H 30mm
Package weight: ~40g
User manual
1x8 headers
Recycle Bin document
SpeakUp click™ board

4. How To Use It?

Before using your click™ board on your target platform, make sure to solder 1x8 male headers to both the left and right side of the board. Two 1x8 male headers are included with the board in the package.

1. Prepare it
Turn the board upside down so that the bottom side is facing upwards. Place the shorter pins of the header into the appropriate soldering pads. Turn the board upward again. Make sure to align the headers so that they are perpendicular to the board, then solder the pins carefully.

2. Configure it
Now you need to train your SpeakUp to obey your commands. Plug the board into your PC with a USB cable. Configure it using the free software (see section 8, Configuration Through Software). Alternatively you can configure the board directly using the on-board buttons (see section 9, Direct Configuration).

3. Use it
The SpeakUp now understands your commands. Connect relays, motors or other electronic actuators directly to SpeakUp’s GPIO pins. Alternatively plug the SpeakUp into any board or shield carrying a mikroBUS™ socket. You can now control your devices with your voice.

5.
Tech Specs

[Board drawing. Labelled parts: Line out pads, USB connector, Microphone, Audio jack, 12 GPIOs (user programmable), Microcontroller (STM32F415RG), Audio Codec (VS1053), Signal LEDs, mikroBUS connector, Push-buttons, JTAG connector. Dimensions marked on the drawing: 10.30 mm (405.50 mils), 57.15 mm (2250 mils), 25.40 mm (1000 mils).]

Along with its key components, the SpeakUp packs other useful bits: two buttons for recording or deleting voice commands manually, and three signal LEDs that give recognition feedback and indicate power.

6. Schematics

[Schematic diagram. Main blocks: STM32F415RG MCU (U1), VS1053 audio codec (U2), two AP7331-ADJ regulators (U3, U4) for the VCC-3.3V and VCC-1.8V rails, USB MINIB connector (CN3), JTAG connector (CN4), microphone (MIC1), audio jack (CN2), mikroBUS device connector, push-buttons SW1 and SW2, LEDs LD1 to LD3, a 12.288MHz crystal, and headers HD1 to HD3.]

Net labels recoverable from the schematic:
GPIOs: IO1-PB1, IO2-PB0, IO3-PA3, IO4-PC7, IO5-PC8, IO6-PC9, IO7-PA2, IO8-PA1, IO9-PA0, IO10-PC2, IO11-PC1, IO12-PC0
LEDs and buttons: LD1-PB2, LD2-PC12, SW1-PB10, SW2-PD2
mikroBUS: AN, RST#, SPI1-CS#, SPI1-SCLK, SPI1-MISO, SPI1-MOSI, PWM, INT, UART3-TX, UART3-RX, I2C1-SCL, I2C1-SDA
Codec: MP3-CS#, MP3-DCS, MP3-DREQ, MP3-RST#, MP3-SCLK, MP3-MISO, MP3-MOSI
USB and debug: USB-D_P, USB-D_N, USB-ID, USB-DET, TMS-SWDIO, TCK-SWCLK

7. How It Works?

What gives the SpeakUp its speech recognition capabilities is the firmware we developed for the on-board MCU. It’s based on the DTW algorithm, which makes it decisive: it turns your talk into action almost instantly.

Input: Sound is received through an on-board microphone. There’s also a 3.5mm jack for connecting an external microphone. Between the mic and the MCU sits a VS1053 IC with a built-in stereo audio codec to process the raw signal.
Output: After the processed sound has been forwarded to the STM32F415RG MCU, which interprets the voice command, there are two output options that can be used at the same time or separately:

STANDALONE MODE: The on-board MCU directly controls external devices using 12 user programmable GPIOs.
CLICK™ MODE: Sends the index of the matched voice command to a selectable interface: USB or UART.

SpeakUp Firmware Algorithm

The main goal of a speech recognition system is to substitute for a human listener, although it is very difficult for an artificial system to achieve the flexibility offered by the human ear and the human brain. The working principle of speech recognition systems is roughly based on the comparison of input data to prerecorded patterns. These patterns can be arranged in the form of phonemes or words. Through this comparison, the pattern to which the input data is most similar is accepted as the symbolic representation of the data.

It is very difficult to compare raw speech signals directly. Because the intensity of speech signals can vary significantly, the signals must be preprocessed. This preprocessing is called Feature Extraction. First, short-time feature vectors are obtained from the input speech data, and then these vectors are compared to the previously classified patterns. The feature vectors extracted from the speech signal are required to best represent the speech data, to be of a size that can be processed efficiently, and to have distinct characteristics.

The SpeakUp firmware uses the Dynamic Time Warping (DTW) algorithm, which is a word-based, isolated-word, speaker-dependent, template-matching algorithm. In word-based speech recognition, the smallest recognition unit is a word. In isolated-word recognition, words that are uttered with short pauses are recognized. Speaker-dependent reference patterns are constructed for a single speaker. Template matching is a form of pattern recognition.
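To make the DTW idea concrete, here is a minimal sketch in C. It is not the SpeakUp firmware, whose source is not part of this manual. It only illustrates dynamic time warping against stored templates under simplifying assumptions: each frame is reduced to a single number rather than a feature vector, the local cost is the absolute difference, and the names dtw_distance, closest_template and max_distance are made up for this illustration.

```c
#include <float.h>
#include <stddef.h>
#include <stdlib.h>

/* Local cost between two frames. Assumption: one number per frame;
   a real recognizer would compare feature vectors here. */
static double frame_cost(double a, double b)
{
    double d = a - b;
    return d < 0.0 ? -d : d;
}

static double min3(double a, double b, double c)
{
    double m = a < b ? a : b;
    return m < c ? m : c;
}

/* DTW distance between x[0..n-1] and y[0..m-1].
   Returns -1.0 for an empty sequence or a failed allocation. */
double dtw_distance(const double *x, size_t n, const double *y, size_t m)
{
    double *prev, *curr, *tmp, result;
    size_t i, j;

    if (n == 0 || m == 0)
        return -1.0;

    /* Only two rows of the cumulative-cost table are kept in memory. */
    prev = malloc((m + 1) * sizeof *prev);
    curr = malloc((m + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1.0;
    }

    /* Row 0: only the empty-versus-empty cell is reachable. */
    prev[0] = 0.0;
    for (j = 1; j <= m; ++j)
        prev[j] = DBL_MAX;

    for (i = 1; i <= n; ++i) {
        curr[0] = DBL_MAX;
        for (j = 1; j <= m; ++j) {
            /* Cheapest way to arrive: from above, from the left, or diagonally. */
            double best = min3(prev[j], curr[j - 1], prev[j - 1]);
            curr[j] = frame_cost(x[i - 1], y[j - 1]) + best;
        }
        tmp = prev;
        prev = curr;
        curr = tmp;
    }

    result = prev[m];
    free(prev);
    free(curr);
    return result;
}

/* Index of the stored template closest to the input, or -1 if no template
   is within max_distance (a hypothetical acceptance limit). */
int closest_template(const double *input, size_t n,
                     const double *const *templates, const size_t *lengths,
                     size_t count, double max_distance)
{
    int best = -1;
    double best_d = 0.0;
    size_t k;

    for (k = 0; k < count; ++k) {
        double d = dtw_distance(input, n, templates[k], lengths[k]);
        if (d < 0.0 || d > max_distance)
            continue;                 /* failed comparison or too far away */
        if (best == -1 || d < best_d) {
            best = (int)k;
            best_d = d;
        }
    }
    return best;
}
```

For example, comparing {1, 2, 3} with {1, 2, 2, 3} gives a distance of 0, because the warping lets the repeated 2 align with the single 2. This is how a word spoken a little slower can still match its template. The max_distance limit plays a role loosely similar to the Acceptance threshold setting in the configuration software, although how the firmware applies that setting is not documented in this manual.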
Template matching represents speech data as sets of feature/parameter vectors called templates. Each word or phrase in an application is stored as a separate template. The input speech is then compared with the stored templates, and the stored template most closely matching the incoming speech pattern is identified as the input word or phrase.

8. Configuration Through Software

The SpeakUp software configuration tool is a free PC application for configuring the SpeakUp click board. With it, you can configure the board to recognize over 200 different voice commands and have the on-board MCU carry them out instantly.

You can download the software from the following link: http://www.mikroe.com/downloads/get/2077/speakup_app.zip

The software is designed with ease of use and simplicity in mind. The UI is based on tabs and drop-down menus, requiring no programming skills to use. Still, it has all the essential features and options that give you full control of the set-up process.

8.1. Typical workflow

Launch the app. The first time you launch the app, a new project is created automatically. Otherwise, the last project you were working on will open. You can also create a new project or open an existing one manually.
Add or edit voice commands.
Everything OK? NO: Adjust Settings. YES: continue.
Assign actions.
Upload.
Close.

8.2. Getting Started

Connect the SpeakUp click board to the computer via the USB cable. It will be recognized as a USB Human Interface Device (HID) in the Device Manager of the Control Panel. Once you connect the SpeakUp to your computer you’re just a few clicks away from configuring it. The set-up process is dead simple. Launch the application, and it will lead you through the initial steps of recording and assigning commands.

Ambient Noise Detection

After the successful connection, the SpeakUp click™ board will perform ambient noise detection and calibrate itself. The process lasts about 10 seconds.
It’s done when the red signal LED turns off. After that the board is ready for recording voice commands. You can set custom calibration parameters for any subsequent usage in the Project Settings (see section 8.5, Configuring Project Settings).

8.3. Creating A New Project

To create a new project, press the Create New Project button in the main toolbar of the SpeakUp software. A new window will open, where you can enter your project’s name and destination folder (if the destination folder doesn’t exist, the software will prompt you to create it). To finish project creation after entering the required information, press the Create button. You can also choose to open the settings menu as soon as the project is created, by checking the appropriate box.

8.4. New Voice Command

Add a voice command: To record a new voice command, press the Add New Voice Command button. A New Voice Command dialog window will appear.

Record it: Press the Record button.

Stay within the time limit: The length of the recording is set in the Project Settings window (see section 8.5, Configuring Project Settings).

Hear it back: The recorded command will be played back automatically, so you can make sure it’s OK.

Name it and save it: If you’re satisfied with the recording, enter a name for your command and click the Save & Close button.

You’re done! The recorded command will appear as a new tab. You can play it back, edit or delete it anytime.

Troubleshoot: If the SpeakUp fails to detect a voice command, your surroundings might be too noisy. Try again by speaking a bit louder. If it still doesn’t work, launch Settings and adjust the Noise threshold.

8.5. Configuring Project Settings

To configure project settings, press the Open Settings Window button and the Settings window will open.

General Settings

In the General Settings you can configure the SpeakUp’s functionality:

Acceptance threshold: This is the parameter you should adjust to define how closely your delivery has to match your pre-recorded command.
At lower values, you’ll have to deliver the command precisely the way you recorded it. At higher values the matching doesn’t have to be so precise, but this increases the probability that the SpeakUp will pick up irrelevant speech and interpret it as a command. You should be able to reach the sweet spot value through some trial and error.

Recording timeout: Timeframe in which the SpeakUp click board expects recording input after the record button is pressed. You can choose between 5, 10 and 15 seconds.

Word Length: Length of the voice command being recorded, in seconds. Can be 1, 1.5, 2, 2.5 or 3 seconds.

Noise level: Minimal sound volume level that can trigger voice command recognition. Lower values let you speak more quietly, but make the board more sensitive to noise/hiss. Higher values require louder pronunciation and are less sensitive to noise/hiss. We recommend that you keep auto detection enabled. That way the SpeakUp click board will measure the noise level and perform noise calibration automatically. Auto detection can last a bit longer, usually around 10 seconds. Sudden changes in sound levels will lengthen the time of calibration and will result in improper sound level values.

Notify master: Notifies the master (MCU or PC) when a voice command is recognized by sending the 16-bit index number of the voice command via the chosen communication interface (UART or USB).

Data rate: Sets the speed used for sending data to the master (MCU or PC).

Pin Aliases And Initial Pin States

In this section, you can rename GPIO pins according to your needs and set their starting conditions. The new GPIO pin aliases will be applied in the main window too. Set the corresponding initial GPIO pin state in the Initial Pin States section. The state can be either low (logical 0) or high (logical 1).

8.6. Assigning An Action

When a new command is recorded, it is time to assign it an action.
The action will be performed when the voice command is recognized. Also, a 16-bit index number of the voice command will be sent via the chosen communication interface (UART or USB). There are five types of action that can be assigned:

NONE: No action will be performed on the corresponding GPIO pin upon voice command matching.
ON: The corresponding GPIO pin will be set to logical high state upon voice command matching.
OFF: The corresponding GPIO pin will be set to logical low state upon voice command matching.
TOGGLE: The corresponding GPIO pin state will be toggled upon voice command matching.
PULSE: A train of pulses will be sent to the corresponding GPIO pin upon voice command matching.

Pulse parameters

The pulse parameters can be set in the Pulse Parameters window (click on the Edit pulse parameters icon to open it):

A period (T) is the time it takes for a signal to complete a single cycle (the sum of the high state and low state time periods).
Duty ratio (D) is the percentage of T in which the signal is active, i.e. the ratio of the high state period to a complete period. For example, a 60% duty ratio means the signal is ON for 60% of the period and OFF for the remaining 40%.
N is the number of times the pulse is repeated.

8.7. Uploading Project

When you’re finished recording and configuring voice commands, it is time to upload the project to the SpeakUp click™ board. This is done via the Upload Project button. You can monitor the upload process in the Toolbar. After it’s done, an appropriate message will be displayed in the Status Bar.

8.8. Exporting Constants

Each recorded voice command is given an index number which is sent to the host MCU. You can export voice command names and their indexes as constants.
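On the host side, the received index can be compared against such constants. The sketch below is only an illustration under stated assumptions. The two constant names follow the style of the file the software exports, but the enum wrapper, the program_t type and the program_for_command function are hypothetical. Reading the index from UART or USB is left out, because this manual does not specify how the 16-bit value is framed on the wire.

```c
#include <stdint.h>

/* Constants in the style of the exported file. An enum is used here
   (an assumption) so that the sketch compiles as standard C. */
enum {
    VCMD_TURN_ON_PROGRAM_A = 0,
    VCMD_TURN_ON_PROGRAM_B = 1
};

/* Hypothetical application state; not part of the SpeakUp software. */
typedef enum {
    PROGRAM_NONE,
    PROGRAM_A,
    PROGRAM_B
} program_t;

/* Map an already decoded 16-bit command index to the program to run.
   An index the host does not know leaves the current program unchanged. */
program_t program_for_command(uint16_t index, program_t current)
{
    switch (index) {
    case VCMD_TURN_ON_PROGRAM_A:
        return PROGRAM_A;
    case VCMD_TURN_ON_PROGRAM_B:
        return PROGRAM_B;
    default:
        return current;
    }
}
```

Keeping the mapping in one function means that re-exporting the constants after adding or reordering voice commands only affects the case labels, not the rest of the host code.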
The exported document will be in the form of a source file (in any of the three languages: mikroC, mikroBasic or mikroPascal), as shown below.

    /*
      This file is generated by SpeakUp Software.
      It containts voice commands constants.
      Creation date: 4/3/2014
      Creation time: 11:20:09 AM

      Name: Turn ON Program A
      Index: 0
      Length: 0.0 s
      Description: Turns on Program A
    */
    const VCMD_TURN_ON_PROGRAM_A = 0;

    /*
      Name: Turn ON program B
      Index: 1
      Description: Turns on Program A
      Length: 0.0 s
    */
    const VCMD_TURN_ON_PROGRAM_B = 1;

9. Direct Configuration

You can perform some basic configuration directly on the SpeakUp without using the software. Different combinations of button presses will allow you to record, re-record or erase commands. You’ll get feedback from the on-board LEDs. However, you won’t be able to assign specific actions with this method.

On-board push-buttons

Use push-buttons to operate the board:

Push-button 1 - To record your voice command, press and hold the button while speaking. You must stay within the time limit for each command (default setting: 1 second). You can also record multiple commands at once by pronouncing them one by one while keeping the button pressed. Just make sure to wait for the red LED to flash between pronouncing subsequent commands. Proceed in this way for as many commands as you need. Each command will be assigned a unique index.

Push-button 2 - If you press it for more than 2 seconds, all recorded voice commands will be erased.

If both push-buttons are pressed for more than 2 seconds, the SpeakUp click board will reset.

On-board LEDs

Two indicator LEDs provide the following signals:

Amber LED - the board is ready for recording or listening.
Red LED - the board is performing an operation.

When the voice command is recognized, both LEDs are lit for half a second.

Standalone mode default settings:

Acceptance Threshold: 15
Recording Timeout: 5s
Word Length: 1s
Noise Level: Auto
Notify Master: USB

10.
Recording Tips

Here are some general recording tips:

For better recording results, keep ambient noise as low as possible and speak at a distance of 10 to 20cm from the microphone.

If there are problems with voice command detection, record the command several times to cover variations in pronunciation.

Always play back the recorded voice command to hear whether some ambient noise was recorded as well. For the same reason, it is recommended that the SpeakUp click™ board is placed on a surface that doesn’t transfer mechanical vibrations.

This is a speaker dependent system. If there are several users, each person should record the voice commands separately, because pronunciation differs from person to person.

The number of voice commands that can be recorded depends on their length; it is typically more than 200 for a voice command length of 1 second.

Please keep in mind that the recording is performed by the SpeakUp click™ board, not the computer, so there is no need to connect an external microphone to the computer.

11. Examples

SpeakUp has a world of applications. It’s up to your imagination to come up with the coolest ideas. Here’s a hint or two:

Use SpeakUp on top of the Pi click Shield to command XBMC Home Media Center on Raspberry Pi®. It’s a great replacement for a mouse and a keyboard.

Replace your lamp switch with a SpeakUp click and a relay. Tell your light to turn ON or OFF if your hands are busy doing something important.

DISCLAIMER

All the products owned by MikroElektronika are protected by copyright law and international copyright treaty. Therefore, this manual is to be treated as any other copyright material. No part of this manual, including product and software described herein, may be reproduced, stored in a retrieval system, translated or transmitted in any form or by any means, without the prior written permission of MikroElektronika.
The manual PDF edition can be printed for private or local use, but not for distribution. Any modification of this manual is prohibited.

MikroElektronika provides this manual ‘as is’ without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties or conditions of merchantability or fitness for a particular purpose.

MikroElektronika shall assume no responsibility or liability for any errors, omissions and inaccuracies that may appear in this manual. In no event shall MikroElektronika, its directors, officers, employees or distributors be liable for any indirect, specific, incidental or consequential damages (including damages for loss of business profits and business information, business interruption or any other pecuniary loss) arising out of the use of this manual or product, even if MikroElektronika has been advised of the possibility of such damages. MikroElektronika reserves the right to change information contained in this manual at any time without prior notice, if necessary.

HIGH RISK ACTIVITIES

The products of MikroElektronika are not fault-tolerant nor designed, manufactured or intended for use or resale as on-line control equipment in hazardous environments requiring fail-safe performance, such as in the operation of nuclear facilities, aircraft navigation or communication systems, air traffic control, direct life support machines or weapons systems in which the failure of Software could lead directly to death, personal injury or severe physical or environmental damage (‘High Risk Activities’). MikroElektronika and its suppliers specifically disclaim any expressed or implied warranty of fitness for High Risk Activities.

TRADEMARKS

The MikroElektronika name and logo, the MikroElektronika logo, the click boards™ are trademarks of MikroElektronika. All other trademarks mentioned herein are property of their respective companies.
All other product and corporate names appearing in this manual may or may not be registered trademarks or copyrights of their respective companies, and are only used for identification or explanation and to the owners’ benefit, with no intent to infringe.

Copyright © 2014 MikroElektronika. All Rights Reserved.

If you want to learn more about our products, please visit our website at www.mikroe.com

If you are experiencing some problems with any of our products or just need additional information, please place your ticket at www.mikroe.com/support/

If you have any questions, comments or business proposals, do not hesitate to contact us at [email protected]