Interested in idea number 16: Support Audio IO module

Vishesh Khosla

Mar 7, 2020, 8:31:30 AM
to opencv-g...@googlegroups.com
Hello everyone,

My name is Vishesh Khosla. I am a second-year undergraduate student at Netaji Subhas University of Technology, currently pursuing Instrumentation and Control Engineering. My area of study revolves around Electronics and Computer Science. Apart from that, I am proficient in C++ (Data Structures and Algorithms) and Python. I have also been a Teaching Assistant for C++ and Data Structures for 4 months at Coding Ninjas (an online platform for learning to code). I am very interested in the field of AI (specifically ML, Deep Learning, and Computer Vision), and I also enjoy competitive programming. I became fascinated with computer vision, have been practising it with the help of OpenCV, and would like to contribute to the organisation.

Machine Learning and Deep Learning Background:
I have been doing machine learning and deep learning in Python for the past 6-7 months. I have mastered the following concepts: Linear Regression, Logistic Regression, Decision Trees, Random Forests, Naive Bayes, Support Vector Machines, Principal Component Analysis, Natural Language Processing, Simple Neural Networks, CNNs, and RNNs. I have built some projects with these, which are available in my GitHub account (link is present below).

Experience with DNN(for Audio):
I have studied audio preprocessing (Short-Time Fourier Transform, spectrograms, MFCCs) with the help of librosa and NumPy, prepared a dataset for music genre classification, and then applied neural networks to predict the music genre. I was able to grasp these topics easily thanks to my good understanding of mathematics.
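For readers unfamiliar with the preprocessing steps named above, here is a minimal sketch of a Short-Time Fourier Transform magnitude spectrogram. It is an illustration only, not the code from the projects mentioned: it uses plain NumPy instead of librosa, and the function name `stft_magnitude`, the frame and hop sizes, and the synthetic test tone are all assumptions. MFCCs would additionally need a mel filterbank and a DCT, which are not shown.

```python
import numpy as np


def stft_magnitude(signal, frame_len=256, hop=128):
    """Return the magnitude spectrogram, shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each with the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real FFT of every frame; keep only the magnitudes.
    return np.abs(np.fft.rfft(frames, axis=1))


# Hypothetical test signal: one second of a 1 kHz sine sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(tone)
# Frequency bin with the most energy, averaged over time.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 256
print(spec.shape, peak_bin, peak_hz)
```

Each row of `spec` is one time frame and each column one frequency bin, so the array can be fed to a classifier directly or converted to a log scale first.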

Vision and Understanding of the Project:
The following ideas could be implemented through this project:
1. Music Genre Classification
2. Speech Recognition
3. Music Instrument Classification
Since I have a good command of English (both verbal and written), I can also contribute by making video tutorials.

Questions:
1. What other ideas need to be implemented in the DNN module for Audio IO?
2. Is applying PCA to audio data a good idea, in order to reduce the size of the test data and make the algorithm more efficient, since there would be fewer features?
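To make the second question concrete, here is a rough sketch of PCA applied to a feature matrix, computed through an SVD in NumPy. It does not answer whether PCA helps on real audio, which would have to be measured. The feature matrix is synthetic, and the name `pca_reduce` and all sizes are assumptions chosen for illustration.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    variance = singular_values ** 2
    # Fraction of the total variance kept by the retained components.
    explained = variance[:n_components].sum() / variance.sum()
    return reduced, explained


# Synthetic stand-in for per-clip audio features (for example 40 MFCC-like
# values per clip). Only 5 directions carry signal; the rest is small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 40))
features = latent @ mixing + 0.01 * rng.normal(size=(200, 40))

reduced, explained = pca_reduce(features, n_components=5)
print(reduced.shape, round(float(explained), 4))
```

The `explained` ratio shows how much variance survives the reduction. On real audio features it may be much lower than on this synthetic data, so it is worth checking before dropping components.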

CV:
Vishesh's Resume (3).pdf

Alexander Nesterov

Mar 10, 2020, 6:46:25 AM
to opencv-gsoc-2020
Hello, my name is Alex. I'm going to be the mentor for the audio support task.
1) We plan to add audio support by changing the videoio pipelines (for example, GStreamer). The next step would be to check the DNN module.
2) I don't know about that yet. We have to see what happens.
Have you worked with GStreamer, FFmpeg, MSMF, or other media pipelines?

Vishesh Khosla

Mar 11, 2020, 1:35:12 PM
to Alexander Nesterov, opencv-gsoc-2020
I have already worked with FFmpeg to perform tasks such as:
1. Changing video and audio file formats.
2. Extracting audio (MP3) from a video file.
3. Compressing an audio file by reducing the bitrate.
4. Muting a video.
5. Changing the resolution of a video.
6. Converting a sequence of images to a video.
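Several of the tasks listed above map onto short `ffmpeg` command lines. The sketch below only builds the argument lists in Python and prints them, without running anything. The helper names and file names are hypothetical, and the exact options used in the original work are not stated in the thread.

```python
import shlex


def extract_audio(video_path, audio_path):
    # -vn drops the video stream; the output extension selects the format.
    return ["ffmpeg", "-i", video_path, "-vn", audio_path]


def reduce_bitrate(src, dst, bitrate="64k"):
    # -b:a sets the target audio bitrate.
    return ["ffmpeg", "-i", src, "-b:a", bitrate, dst]


def mute_video(src, dst):
    # -an drops the audio stream; -c:v copy keeps the video without re-encoding.
    return ["ffmpeg", "-i", src, "-an", "-c:v", "copy", dst]


def change_resolution(src, dst, width, height):
    # The scale video filter resizes every frame.
    return ["ffmpeg", "-i", src, "-vf", f"scale={width}:{height}", dst]


def images_to_video(pattern, dst, fps=25):
    # -framerate is an input option, so it must come before -i.
    return ["ffmpeg", "-framerate", str(fps), "-i", pattern, dst]


commands = [
    extract_audio("input.mp4", "audio.mp3"),
    reduce_bitrate("audio.mp3", "small.mp3"),
    mute_video("input.mp4", "muted.mp4"),
    change_resolution("input.mp4", "small.mp4", 640, 360),
    images_to_video("frame_%04d.png", "out.mp4"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run one of these, the list can be passed to `subprocess.run(cmd, check=True)` on a machine where `ffmpeg` is installed.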

--
You received this message because you are subscribed to the Google Groups "opencv-gsoc-2020" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencv-gsoc-20...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/opencv-gsoc-2020/ce4c1d91-2d0a-4896-afc6-17f102c63ffa%40googlegroups.com.

Vishesh Khosla

Mar 16, 2020, 7:04:23 AM
to Alexander Nesterov, opencv-gsoc-2020
I have familiarized myself with GStreamer and learned about the elements of a pipeline (mux, demux, source, sink, etc.). I learned to perform the following tasks with it:
1. Webcam streaming
2. Audio streaming (MP3, web)
3. Opus IP audio streaming
4. Choosing which WASAPI sound card to use
Apart from that, I would like some suggestions regarding the proposal: what should I include in it? Please also suggest a mini project that I could do to improve my chances of being selected for GSoC 2020.
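As a small illustration of the source-to-sink structure mentioned above, the sketch below assembles pipeline descriptions in the `gst-launch` text syntax, where elements are linked with `!`. The helpers `element` and `pipeline` are hypothetical, and the element choices are assumptions picked so that no hardware is needed. These are not the pipelines from the tasks listed.

```python
def element(name, **props):
    """Format one element with optional properties, gst-launch style."""
    # Python keyword names use underscores; GStreamer properties use hyphens.
    parts = [name] + [
        f"{key.replace('_', '-')}={value}" for key, value in props.items()
    ]
    return " ".join(parts)


def pipeline(*elements):
    """Link elements in order; ' ! ' is the gst-launch link operator."""
    return " ! ".join(elements)


# Video: test source -> colour conversion -> appsink (so an application
# can pull the frames).
video = pipeline(
    element("videotestsrc", num_buffers=100),
    element("videoconvert"),
    element("appsink"),
)

# Audio: test tone -> format conversion -> default audio output.
audio = pipeline(
    element("audiotestsrc", num_buffers=100),
    element("audioconvert"),
    element("autoaudiosink"),
)

print(video)
print(audio)
```

In an OpenCV build with GStreamer support, a description ending in `appsink`, like `video`, can be passed to `cv2.VideoCapture(video, cv2.CAP_GSTREAMER)`.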

Alexander Nesterov

Mar 16, 2020, 7:47:56 AM
to opencv-gsoc-2020
Hello, you can explore the GStreamer pipeline in the videoio module, because we would like to change it for audio support. I advise you to base your proposal on these changes, or to propose your own idea for changing the existing videoio functionality.

Vishesh Khosla

Mar 19, 2020, 7:51:51 AM
to Alexander Nesterov, opencv-gsoc-2020
Hello, I was going through the GStreamer pipeline in the videoio module and found that, in the class GStreamerCapture, the public function bool setProperty(int propId, double value) has no code for the case where we want to zoom the video, i.e. for propId = 27 (CAP_PROP_ZOOM). I would like to contribute this code (I know the logic of zooming a frame). Please tell me how I can proceed.
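The zoom logic itself is not spelled out in the thread. One common reading is a digital zoom: crop the centre of the frame and scale it back to the original size. Here is a minimal NumPy sketch under that assumption. The function `center_zoom` is hypothetical, is not part of OpenCV, and does not reflect how GStreamerCapture handles properties. In practice the upscaling would more likely be done with an interpolating resize such as `cv2.resize`.

```python
import numpy as np


def center_zoom(frame, factor):
    """Digital zoom: crop the centre by `factor`, then scale back up."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    height, width = frame.shape[:2]
    crop_h = max(1, int(round(height / factor)))
    crop_w = max(1, int(round(width / factor)))
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    crop = frame[top : top + crop_h, left : left + crop_w]
    # Nearest-neighbour upscaling: map each output pixel to a source pixel.
    rows = np.arange(height) * crop_h // height
    cols = np.arange(width) * crop_w // width
    return crop[rows][:, cols]


# Tiny synthetic "frame" whose pixel values encode their own position.
frame = np.arange(8 * 8).reshape(8, 8)
zoomed = center_zoom(frame, 2.0)
print(zoomed.shape)
```

The output keeps the input shape, so the zoomed frame can replace the original in whatever processing follows. The same indexing also works for colour frames of shape (height, width, channels).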



Alexander Nesterov

Mar 20, 2020, 6:25:37 AM
to opencv-gsoc-2020
I don't quite understand how video zoom relates to audio support.

Vishesh Khosla

Mar 20, 2020, 7:20:44 AM
to Alexander Nesterov, opencv-gsoc-2020
It's related to video only.


Vishesh Khosla

Mar 25, 2020, 10:21:23 AM
to Alexander Nesterov, opencv-gsoc-2020
Hello,
I have shared my proposal with OpenCV on the Google Summer of Code site. Please take a look at it and tell me what changes I need to make.

Vishesh Khosla

Mar 28, 2020, 2:30:04 AM
to Alexander Nesterov, opencv-gsoc-2020
Hello,
If you have read my proposal, please give me some feedback, as time is running out.

Best Regards
Vishesh Khosla

Vishesh Khosla

Mar 28, 2020, 6:05:52 PM
to Alexander Nesterov, opencv-gsoc-2020
Hello,
The final date of submission is 31st March and time is running out fast. It would be really helpful if I could get some feedback and suggestions on my proposal.
Best Regards
Vishesh

Alexander Nesterov

Mar 28, 2020, 6:38:36 PM
to opencv-gsoc-2020
Hello, excuse me, I will review the proposal shortly.

Alexander Nesterov

Mar 28, 2020, 6:51:16 PM
to opencv-gsoc-2020
Hello, thank you, it looks acceptable. You can add the proposal in the tool and wait for the results.

Vishesh Khosla

Mar 29, 2020, 12:53:45 PM
to opencv-gsoc-2020

Hello,
By adding the proposal in the tool, do you mean changing the proposal tag or making a final submission of the proposal?