© The University of Sheffield

COM3502-4502-6502 Speech Processing: Lecture 1, slide 1

COM3502-4502-6502

SPEECH PROCESSING

Lecture 1

Introduction

COM3502-4502-6502 Speech Processing: Lecture 1, slide 2

The Course

• Speech

– speaking and hearing

– acoustics and sound

– the nature of speech

– sounds and symbols

– phonetics

– phonology

– prosody

• Speech Processing

– signals and spectra

– sampling

– waveform processing

– the Fourier transform

– filters and linear prediction

– cepstral analysis

COM3502-4502-6502 Speech Processing: Lecture 1, slide 3

Python & Jupyter Notebooks

• Python

https://www.anaconda.com/products/individual

Picture © 2017 Project Jupyter Contributors

COM3502-4502-6502 Speech Processing: Lecture 1, slide 4

Python & Jupyter Notebooks

• Python for exercises

– Jupyter Notebooks

– Anaconda

https://www.anaconda.com/products/individual

COM3502-4502-6502 Speech Processing: Lecture 1, slide 5

Recommended Text Books

COM3502-4502-6502 Speech Processing: Lecture 1, slide 6

Teaching Staff

Lecturer: Stefan Goetze

Graduate Teaching Assistants: George L. Close, Robbie Sutherland, Jason Clarke

COM3502-4502-6502 Speech Processing: Lecture 1, slide 7

Logistics

Lecture material on Blackboard:

https://vle.shef.ac.uk

COM3502-4502-6502 Speech Processing: Lecture 1, slide 8

Logistics

• Lectures

– 20 × 50 min (with breaks)

– 2 per week

• Practical work

– Weekly lab sheets (first in week 2)

– Main programming assignment (~9 weeks' work)

• Feedback/Interaction with Teaching Staff

– Ask questions during lecture! Beneficial for everyone

– Blackboard Discussion Group (lectures + practical work)

– Sample Solutions for Lab Sheets

– Staff contact details on Blackboard (https://vle.shef.ac.uk)

• Assessment

– Main programming assignment (worth 55%)

– Blackboard exam (worth 45%)

• Lecture notes (these slides)

– available on Blackboard prior to each lecture

COM3502-4502-6502 Speech Processing: Lecture 1, slide 9

What is Speech Processing?

“… the study of speech signals and the processing methods of these signals”

“… a special case of digital signal processing applied to speech signals”

http://en.wikipedia.org/wiki/Speech_processing

COM3502-4502-6502 Speech Processing: Lecture 1, slide 10

What is Speech Processing For?

2020: 9,200,000,000

1876: 2

https://www.independent.co.uk/life-style/gadgets-and-tech/news/there-are-officially-more-mobile-devices-than-people-in-the-world-9780518.html

COM3502-4502-6502 Speech Processing: Lecture 1, slide 11

Speech Processing Technologies

• Automatic Speech Recognition

• Text-to-Speech Synthesis

• Spoken Language Dialogue Systems

• Digital Speech Coding

COM3502-4502-6502 Speech Processing: Lecture 1, slide 12

Extracting Information from Speech

Speech Signal →

• Speech Recognition → Words: “How are you?”

• Language Recognition → Language: English

• Speaker Recognition → Speaker: John Smith

• Accent Recognition → Accent: Sheffield

• Emotion Recognition → Emotion: ‘happy’

COM3502-4502-6502 Speech Processing: Lecture 1, slide 13

Speech Enhancement

[Figure: four waveform plots (amplitude from -1 to 1, horizontal axis from 0 to 1.6) labelled Speech, Noise, Speech+Noise and Recovered Speech. A block labelled Speech Processing turns Speech+Noise into Recovered Speech.]

COM3502-4502-6502 Speech Processing: Lecture 1, slide 16

Significant Market Penetration

COM3502-4502-6502 Speech Processing: Lecture 1, slide 18

Why Speech Processing?

Lots of applications

… especially in Science Fiction!

Star Trek IV: The Voyage Home (1986)

COM3502-4502-6502 Speech Processing: Lecture 1, slide 19

Why use Speech?

“Speech is the ‘natural’ way to interact with your computer.”

Speech may be a more intuitive way of accessing information, controlling things and communicating …

but there may be viable alternatives

COM3502-4502-6502 Speech Processing: Lecture 1, slide 20

Why use Speech?

Some alternatives can be problematic…

COM3502-4502-6502 Speech Processing: Lecture 1, slide 21

The Advantages of Speech

• hands-free

• eyes-free

• fast

• intuitive

“You have been learning since birth the only skill needed to operate our equipment.”

Fran Capo, World Record Holder: 603.32 wpm

Video source: https://www.youtube.com/watch?v=qv4nimqsajA

COM3502-4502-6502 Speech Processing: Lecture 1, slide 30

Command & Control: Vehicle

Robust Speaker-Independent Small-Vocabulary Automatic Speech Recognition

COM3502-4502-6502 Speech Processing: Lecture 1, slide 36

Intelligence: Voice Stress Analysis

Vocal Emotion Detection

COM3502-4502-6502 Speech Processing: Lecture 1, slide 37

Processing: Speaker Localisation

Direction/Position Estimation

COM3502-4502-6502 Speech Processing: Lecture 1, slide 38

Processing: Special Effects

Vocal Manipulation

COM3502-4502-6502 Speech Processing: Lecture 1, slide 39

Processing: Audio Alignment

COM3502-4502-6502 Speech Processing: Lecture 1, slide 40

(Some) Notation & Basics

COM3502-4502-6502 Speech Processing: Lecture 1, slide 41

Recap: Real Numbers

• Natural numbers ℕ = {1, 2, 3, …}

• Whole numbers ℕ0 = {0, 1, 2, 3, …}

• Integer numbers ℤ = {…, -2, -1, 0, 1, 2, 3, …}

• Rational numbers ℚ = {a/b : a ∈ ℤ and b ∈ ℕ}

every number resulting from a ratio

• Real numbers ℝ (includes e.g. π = 3.14159…)

positive numbers

negative numbers

COM3502-4502-6502 Speech Processing: Lecture 1, slide 42

Complex Numbers

• Complex numbers ℂ

– Add “imaginary” dimension

– Real part + j · imaginary part: z = Re{z} + j Im{z}

– Addition by interpreting complex numbers as vectors

– z: complex number; Re{z}: real part; Im{z}: imaginary part; j: imaginary unit (sometimes written i)
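A minimal sketch of complex addition read as vector addition, using Python's built-in complex type. Python writes the imaginary unit as `j`. The values of `z1` and `z2` are made up for illustration and do not come from the slides.

```python
import cmath

# Hypothetical example values; the slide does not give specific numbers.
z1 = complex(3.0, 4.0)   # real part 3, imaginary part 4
z2 = complex(1.0, -2.0)  # real part 1, imaginary part -2

# Addition works component-wise, like adding two 2-D vectors.
z_sum = z1 + z2

print(z_sum.real, z_sum.imag)  # 4.0 2.0
print(abs(z1))                 # 5.0, the length of the vector (3, 4)
print(cmath.phase(z1))         # angle of the vector in radians
```

Real parts add to real parts and imaginary parts add to imaginary parts, which is exactly the rule for adding two vectors in the plane.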

COM3502-4502-6502 Speech Processing: Lecture 1, slide 43

Continuous / Discrete Time

[Figure: a waveform shown against continuous time and against a discrete time index]
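The step from continuous time to a discrete index can be sketched in a few lines of Python. This is only an illustration under assumed names: the index `k`, the sampling frequency `fs_hz` and the 1 Hz sine are choices made here, not taken from the slide.

```python
import math

def sample_sine(freq_hz, fs_hz, num_samples):
    """Return x[k] = sin(2*pi*f*k/fs) for k = 0 .. num_samples-1."""
    return [math.sin(2.0 * math.pi * freq_hz * k / fs_hz)
            for k in range(num_samples)]

# Assumed values for illustration: a 1 Hz sine sampled at 8 Hz.
fs_hz = 8.0
x = sample_sine(freq_hz=1.0, fs_hz=fs_hz, num_samples=9)

# Each discrete index k corresponds to the continuous time t = k / fs.
for k, value in enumerate(x):
    print(f"k={k}  t={k / fs_hz:.3f} s  x[k]={value:+.3f}")
```

The continuous signal x(t) is defined for every t, while the list `x` only holds its values at the sampling instants t = k / fs.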

COM3502-4502-6502 Speech Processing: Lecture 1, slide 44

Vector Notation

• Scalars:

– Signals:

• Vectors:

– Signal vectors:

COM3502-4502-6502 Speech Processing: Lecture 1, slide 45

Notation

• Scalars:

– Signals:

• Vectors:

– Signal vectors:

• Matrices:

COM3502-4502-6502 Speech Processing: Lecture 1, slide 46

Notation

[Table of example symbols. Rows: scalar, vector, matrix. Columns: continuous time domain, discrete time domain, continuous freq. domain, discrete freq. domain.]

• Scalars: ‘normal’ letters

• Vectors: bold letters

• Matrices: bold capital letters

• (Time-)continuous signals: (round) parentheses

• Discrete signals: (square) brackets

COM3502-4502-6502 Speech Processing: Lecture 1, slide 47

Notation – Part 1

• Scalars:

– Impulse responses (time-varying):

– Example for a (real) convolution:

• Vectors:

– Signal vectors:

– Impulse response vectors (time-varying):

– Example for a (real) convolution:

• Matrices:
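The convolution formulas themselves did not survive in this text, so the following is only a sketch of a real, discrete convolution under a common textbook definition, y[k] = Σ_i h[i] · x[k − i]. It assumes a fixed (time-invariant) impulse response for simplicity. The function name `convolve` and the example values are hypothetical.

```python
def convolve(x, h):
    """Linear convolution y[k] = sum_i h[i] * x[k - i] of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k in range(len(y)):
        for i, h_i in enumerate(h):
            # Only use samples of x that actually exist.
            if 0 <= k - i < len(x):
                y[k] += h_i * x[k - i]
    return y

# Hypothetical example: a short signal and a two-tap impulse response.
x = [1.0, 2.0, 3.0]
h = [1.0, 0.5]
y = convolve(x, h)
print(y)  # [1.0, 2.5, 4.0, 1.5]
```

Written with vectors, each output sample is the inner product of the impulse response vector with a vector of the most recent input samples.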

COM3502-4502-6502 Speech Processing: Lecture 1, slide 49

Notation – Part 2

• Auto- and cross-correlation with real, stationary stochastic processes:

– Autocorrelation function:

– Cross-correlation function:

– Auto-power density spectrum:

– Cross-power density spectrum:

[Annotation on the spectra: continuous frequency index]
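The formulas for these quantities are missing from this text. As a rough sketch only, the code below uses one common convention: a biased estimate of the correlation at a given lag, and the power density spectrum as the Fourier transform of the autocorrelation. The lag sign convention, the 1/N scaling and all function names are assumptions made here and may differ from the lecture's definitions.

```python
import cmath
import math

def cross_correlation(x, y, lag):
    """Biased estimate r_xy[lag] = (1/N) * sum_k x[k + lag] * y[k].

    Assumes x and y are real sequences of equal length N.
    """
    n = len(x)
    total = 0.0
    for k in range(n):
        if 0 <= k + lag < n:
            total += x[k + lag] * y[k]
    return total / n

def autocorrelation(x, lag):
    """Autocorrelation is the cross-correlation of a signal with itself."""
    return cross_correlation(x, x, lag)

def power_density(x, omega, max_lag):
    """Fourier transform of the autocorrelation estimate at the continuous
    frequency omega (radians per sample), truncated to |lag| <= max_lag."""
    total = sum(autocorrelation(x, lag) * cmath.exp(-1j * omega * lag)
                for lag in range(-max_lag, max_lag + 1))
    return total.real  # imaginary part vanishes for real signals

# Hypothetical example: a signal alternating at the highest discrete frequency.
x = [1.0, -1.0, 1.0, -1.0]
print(autocorrelation(x, 0))            # 1.0
print(autocorrelation(x, 1))            # -0.75
print(power_density(x, math.pi, 1))     # close to 2.5
```

The frequency argument `omega` is continuous even though the signal is discrete, which is why it can take any real value rather than only integer bin indices.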

COM3502-4502-6502 Speech Processing: Lecture 1, slide 50

This lecture has covered …

• What speech processing is

• What speech processing is for

• Speech processing technologies

• The advantages of speech

• Types of application

• Some math notation

COM3502-4502-6502 Speech Processing: Lecture 1, slide 51

Any Questions?

Ask during the lecture (e.g., now), or post in the Blackboard Discussion Group

COM3502-4502-6502 Speech Processing: Lecture 1, slide 52

Next lecture …

Sound
