
Convert Audio (.wav) files to MIDI! [ http://www.digital-ear.com ]


Epinoisis Software

May 19, 2001, 9:48:40 AM

Digital Ear® for Windows 2000/Me/98/95

http://www.digital-ear.com/

» What is Digital Ear® ?

Digital Ear can analyze a recorded monophonic performance (e.g. a singing voice, saxophone solo, or any other musical instrument) and convert it into a standard MIDI file! That file can be played by any synthesizer with a different voice of your choice, or it can be imported into your favorite sequencer (e.g. Cubase VST, Cakewalk, etc.) for mixing with other tracks and further processing. Digital Ear reads standard PCM audio (.wav) files.

» Beyond simple «Pitch-to-MIDI» Conversion...

Unlike conventional so-called «Pitch-to-MIDI» converters, Digital Ear will send high-resolution pitch events closely matching those of your original sound. Any vibrato, tremolo, pitch-bend, or portamento effects of your recorded sound will be faithfully converted and reproduced into any voice of your synthesizer.

» Let the Magic Begin!

NEW!

Even if you are a novice user, our proprietary Settings Wizard™ technology will help you quickly and easily find the optimal settings for any musical instrument. Things couldn't be easier for you!

» The Voice Realism is here…  

A unique feature of Digital Ear, not found elsewhere, is the capture of detailed volume envelope and timbre dynamics events. These features can really boost your synthesizer's voice realism and enhance your musical expression.

Your MIDI files will never sound the same again.

» Digital Ear® version 3.0 Key Features

  • State-of-the-Art recognition engine. Based on the latest psychoacoustical research on human pitch perception. Captures with incredible accuracy and speed (30 times faster than real-time) instantaneous pitch, volume, and timbre dynamics, with minimal errors.

  • Full-featured built-in Voice Features Editor. View your voice features as they evolve over time with an advanced graphical representation (virtual keyboard, chart, sliders). Edit the pitch, volume and brightness of the sound at any time-slice accurately and quickly.

  • NEW! Settings Wizard: This is an advanced feature of Digital Ear 3. It allows you to automatically find the optimal settings for a particular audio file, for the best conversion results without trouble.

  • Completely customizable to match every musical instrument or human voice. Store an unlimited number of user-defined engine settings.

  • Smart Event Detector: This brand-new feature of Digital Ear uses a new real-time algorithm to accurately recognize a note's attack. This feature is particularly effective for string instruments. On-the-fly real-time sensitivity adjustment.

  • Ultra high time resolution (10 ms frame size minimum).

  • Two types of MIDI files ensure full compatibility with all known sequencers (such as Cubase VST, Cakewalk, etc.).

  • Power Tools:

  • NEW! Auto Correct. Automatically cleans up most tracking errors for you.

  • NEW! Transposer: Transpose your sequence with 1/100 of a semitone accuracy.

  • Pitch Quantize. Quantize your track to the nearest semitone values.

  • NEW! Soft Quantization: User selectable natural sounding quantization.

  • In-Tune Wizard. Automatically tunes a de-tuned melody for you, without altering the performance dynamics.

  • Integrated MIDI and Wave file player. Sends your MIDI file to any user-selectable MIDI device. Preview your wave files without leaving Digital Ear.

  • NEW! MIDI controller redirection. Select any MIDI controller to send brightness and volume (expression) events.

  • Unlimited MIDI voice selection. Select any of the 128 GM (General MIDI) voices.

  • Fully automated for GM (General MIDI) and Yamaha-XG compatible synthesizers. Also supports MIDI synthesizers that do not conform to these standards.
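
The Transposer, Pitch Quantize and Soft Quantization tools listed above can be pictured with a small sketch. This is only an illustration, assuming pitch is tracked as fractional MIDI note numbers (60.25 = a quarter of a semitone above middle C); the function names and the `strength` parameter are hypothetical, and nothing here is taken from Digital Ear itself.

```python
import math

def transpose(pitches, cents):
    """Shift every pitch (fractional MIDI note numbers) by a number of cents.

    One cent is 1/100 of a semitone, so 50 cents moves each pitch by 0.5.
    """
    return [p + cents / 100.0 for p in pitches]

def quantize(pitches, strength=1.0):
    """Pull each pitch toward its nearest semitone.

    strength=1.0 snaps fully to the semitone (hard quantize); smaller values
    keep part of the original deviation (a hypothetical "soft" quantize).
    """
    result = []
    for p in pitches:
        nearest = math.floor(p + 0.5)  # round half up, avoids banker's rounding
        result.append(p + strength * (nearest - p))
    return result

if __name__ == "__main__":
    track = [60.25, 61.75, 64.5]
    print(transpose(track, 50))
    print(quantize(track))
    print(quantize(track, strength=0.5))
```

With `strength` below 1.0 some of the vibrato or detuning survives, which is one plausible reading of "natural sounding quantization".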


__________________________________________________________________________
 
Epinoisis Software, the Digital Ear Team
 
E-mail : in...@digital-ear.com
Web: http://digital-ear.com/
Fax: +1-(360)-237-6361
Voice Mail: +1-(360)-237-6361
__________________________________________________________________________

Hugh Jardon

May 19, 2001, 11:16:09 AM

Wow thats gay. there is freeware out that does the same thing.
probably better (its free). it also works on some polyphonic riffs

Richard White

May 19, 2001, 12:04:59 PM

In article <3b068dc7.776700@news>, sho...@home.com (Hugh Jardon) wrote:

> Wow thats gay. there is freeware out that does the same thing.
> probably better (its free). it also works on some polyphonic riffs

So tell us what, where and how?

Richard White
--
Hear Linda Ronstadt sing Richard White on
'A Merry Little Christmas' Elektra #62572-2/4
CDs: 'Music for Guitar' and 'Music for Woodwinds and Piano'
available at: http://www.mp3.com/richardwhite
http://listen.to/richardwhite whi...@flash.net

Yeah, Right

May 19, 2001, 12:22:30 PM

This must be GREAT stuff, here...incredible technology...it's 30 times FASTER than real time!  Wow! 
 
And it even supports GM/GS/Non-GM compatible synths!  OOOOOO!  I must have this!

"Epinoisis Software" <info@digital-ear.com> wrote in message news:9e5tk4$lc$1...@netnews.upenn.edu...

Nick Thomson

May 19, 2001, 2:07:24 PM

Digital Ear is the TOP of the audio to MIDI
converters today.

No other software comes even close capturing
the full nuance of the instrument. Thanks guys!

As for the freeware stuff: I would like to know
too where can I find that free polyphonic
converter... I guess I will have to wait for
long time.


"Epinoisis Software" <in...@digital-ear.com> wrote in message news:<9e5tk4$lc$1...@netnews.upenn.edu>...


> Digital Ear® for Windows 2000/Me/98/95
>
> http://www.digital-ear.com/
>

[snip]

Hugh Jardon

May 19, 2001, 5:39:30 PM

On 19 May 2001 11:07:24 -0700, seek...@hotmail.com (Nick Thomson)
wrote:

i forget whats its called sorry. i believe i grabbed it from
hitsquad.com

Ian Shatwell

May 21, 2001, 4:41:11 AM

sho...@home.com (Hugh Jardon) wrote in message news:<3b06e7c2.3096478@news>...

> On 19 May 2001 11:07:24 -0700, seek...@hotmail.com (Nick Thomson)
> wrote:
>
[Usual DE hyperbole snipped]

> >
> >As for the freeware stuff: I would like to know
> >too where can I find that free polyphonic
> >converter... I guess I will have to wait for
> >long time.
> >
> >
> >[snip]

> i forget whats its called sorry. i believe i grabbed it from
> hitsquad.com

There are a few packages available from www.hitsquad.com/smm. I would
recommend WaveGoodbye (www.btinternet.com/~irshatwell/WaveGoodbye),
but by all means consider me biased since I am the author. This is far
from perfect but does a reasonable job with piano music. (It's
hopeless at guitar recognition though.) Beta releases of version 2
should be ready in a few months, which will be much more accurate than
the current version.

Bear in mind that polyphonic recognition cannot be done in the same
way as monophonic and is much MUCH harder. There are no packages,
free, shareware or commercial, which will do a perfect job. For a
comprehensive list see the alt.music.midi FAQ, which is posted to this
group about every two weeks. This includes a link to many free
packages and demos of the commercial systems.

The only way of getting an accurate audio to midi conversion is 'by
ear', if you have the talent. Even the better polyphonic recognition
packages will only save you about a third of the time, by providing a
good baseline to start from.

PeopleNet Software

May 24, 2001, 10:21:01 AM

Hello,

To demonstrate the difference with similar wav to midi software
have a look at a completely different example (it is not even music!)
namely chirp.wav and chirp.mid at our audio demos page:

http://www.digital-ear.com/midi.htm

To our knowledge there is no other software that can
accomplish this.

Regards,

Epinoisis Software

May 24, 2001, 5:00:57 PM

Epinoisis Software

May 24, 2001, 9:26:44 PM

> And no one other than Epinoisis who will stand up and say he can get
> this sort of performance from the software.

Yes, because only our software can do this. Why shouldn't we say that?

>
> Now, after looking at the demo page let me say this. There are a
> number of MID files there allegedly converted from the posted WAV
> files. If folks will download those MID files and actually look at
> them they will find less than 1%, usually less than 0.5%, of the
> content of most of those MID files is NOTE events and the rest is
> volume and pitch wheel events.

So what? If an instrument is played LEGATO style this is the only
correct way to render it realistically as a MIDI file. Anything else
will sound silly.

> This makes the resulting files a
> mediocre soundalike to the original WAV files, but makes them
> essentially uneditable.

mediocre? We would say incredibly close to the original (and
some times *better*, depending on the synth you have). Of
course with a cheap soundcard do not expect miracles!

unedible?

First of all the file can be edited INSIDE Digital Ear
with high level of detail (no other software offers this)

Second: any decent sequencer can
-edit pitch events,
-change the tempo,
-change the timbre,
-feed the BRIGHTNESS events to any controler
-transpose it

[the list could be endless]

Do you still call this unedible?

> I can't touch up THOSE particular disasters unless I spend a lifetime
> doing so.

You must be the only one that calls those demos
at http://www.digital-ear.com/midi.htm "disasters".

People at FUTURE MUSIC MAGAZINE call them
"So close to the original that is SCARY"

Do you work for a competitor of Digital Ear :-) ?

Regards,
--

Dr.Matt

May 24, 2001, 9:37:06 PM

I think it would be funny if MIDI turned out to be a new data
compression format. But on what synth?


--
For spammers: http://www-personal.umich.edu/~fields/uce.htm
My CD "Kabala": http://www-personal.umich.edu/~fields/cd.html
Matt Fields DMA http://listen.to/mattaj TwelveToneToyBox http://start.at/tttb
"Is there a theorbo in the house?"

Unknown

May 24, 2001, 10:53:36 PM

On Fri, 25 May 2001 04:26:44 +0300, "Epinoisis Software"
<in...@digital-ear.com> wrote:


>
>Do you work for a competitor of Digital Ear :-) ?
>

err..if yours is the only software that can do this,then surely you
don't have any competitors ;)


Nick White

May 25, 2001, 8:39:58 AM

On Fri, 25 May 2001 04:26:44 +0300, "Epinoisis Software"
<in...@digital-ear.com> wrote:


>So what? If an instrument is played LEGATO style this is the only
>correct way to render it realistically as a MIDI file. Anything else
>will sound silly.

Not so. A Guitar played legato has a certain amount of attack as each
note is started by the fingers hitting the frets, even though not
plucked. A Piano played legato is the same. Even a Sax or Trumpet will
have a certain tonal change as the new note is started. Fretless
strings such as Violin, Fretless bass etc, OK mostly, although again a
Violin can be played lagato, with the player actually fingering the
note changes, not simply sliding the hand up the neck of the violin.
That is in fact how most Violin is played AFAIK.

I found (in my 2 second trials of the version I tried) that because
the note was held and bent, it sounded wimpy, or "slippery" in many
cases, and there was no way to "retrigger" (as the fingers hitting the
guitar frets would do to a certain extent. You can make a guitar keep
playing simply keeping on moving your fingers on the fret board.) I
commented on this way back, but I don't think it was taken any
further.

Also, by stretching the note up or down with bend, instead of using
new notes, many synths WILL sound silly as the "donald duck/giant
voice syndrome" takes over, because a new sample is not used..

>
>> This makes the resulting files a
>> mediocre soundalike to the original WAV files, but makes them
>> essentially uneditable.
>
>mediocre? We would say incredibly close to the original (and
>some times *better*, depending on the synth you have). Of
>course with a cheap soundcard do not expect miracles!
>

hmmmm... a MIDI rendition better than the thing it's copying....bah!
to quote a phrase from Jim Higgins.

>unedible?
>
>First of all the file can be edited INSIDE Digital Ear
>with high level of detail (no other software offers this)
>
>Second: any decent sequencer can
>-edit pitch events,
>-change the tempo,
>-change the timbre,
>-feed the BRIGHTNESS events to any controler
>-transpose it
>

But very few of these work with a single note 2 bars long, as far as
being able to alter parts of the note.

>[the list could be endless]
>

...and so is the work of trying to tune a Bended note, to get the
right effect

>Do you still call this unedible?
>

.... only essentially uneditable....

>
>Do you work for a competitor of Digital Ear :-) ?

Ah! Here we go again! This is what got my goat enough to reply. I was
going to ignore the whole thing.

Is it the same old package being trotted out, or a new programme
altogether? Do you have a _working_ demo? Is it more than 2 seconds
long?

I STILL say that you have a product that is soooo close....to being
usable by _some_ people, in _some_ situations. But you have thoroughly
kept it to yourself by way of a really laughable demo, so you are not
getting feedback from independant people who _really_ try the thing (I
am afraid I do not count magazines, and never have).

The demo was way too short and did not allow various tweeks that MAYBE
would allow some of the apparent shortcomings to be overcome.

Apart from the vituperation in which I was involved over this issue, I
and others did make many constructive suggestions, which were ignored
or refuted in the most part.

Nick White --- HEAD:Hertz Music


(please remove ns from my header email address to reply)
....damn spam


!!
<")
_/ )
( )
_//- \__/


Epinoisis Software

May 25, 2001, 9:31:59 AM

----- Original Message -----
From: "Nick White" <nsnf...@iinet.net.au>

> I found (in my 2 second trials of the version I tried) that because
> the note was held and bent, it sounded wimpy, or "slippery" in many
> cases, and there was no way to "retrigger" (as the fingers hitting the

There are two ways to "retrigger" using Digital Ear:
----------------------------------------------------------------
1) Automatically: Using the smart attack detector (this works fine on
instruments like guitar)

2) Manually: By pressing the "Delete" key at the point you want to
trigger a new event.

> Also, by stretching the note up or down with bend, instead of using
> new notes, many synths WILL sound silly as the "donald duck/giant
> voice syndrome" takes over, because a new sample is not used..

This would occur if you go beyond one octave. BUT, the maximum
pitchbend range is 12 semitones (and you can still adjust this as
SMALL as you like, you can even have a range of +/- one semitone
and retrigger as frequently as you like...)

But in any case this is a synthesizer limitation and not Digital Ear's.
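
The trade-off between bend range and retriggering can be sketched in a few lines. This is an illustration only, not Digital Ear's algorithm: it assumes the standard MIDI conventions (A4 = 440 Hz = note 69, a 14-bit pitch-bend value centred on 8192), and the function names and the retrigger rule (start a new note whenever the pitch drifts outside the bend range) are hypothetical.

```python
import math

def freq_to_midi(freq_hz):
    """Fractional MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

def bend_value(offset_semitones, bend_range=12.0):
    """14-bit pitch-bend value (0..16383, centre 8192) for an offset in semitones."""
    if abs(offset_semitones) > bend_range:
        raise ValueError("offset exceeds bend range; a new note is needed")
    value = 8192 + round(offset_semitones / bend_range * 8192)
    return max(0, min(16383, value))  # +full range would be 16384, so clamp

def track(freqs, bend_range=12.0):
    """Turn a list of per-frame frequencies into note-on and bend events.

    A new note is triggered only when the pitch leaves the bend range of the
    note currently sounding; everything else becomes a pitch-bend event.
    """
    events = []
    base = None
    for f in freqs:
        pitch = freq_to_midi(f)
        if base is None or abs(pitch - base) > bend_range:
            base = math.floor(pitch + 0.5)
            events.append(("note_on", base))
        events.append(("bend", bend_value(pitch - base, bend_range)))
    return events

if __name__ == "__main__":
    octave_jump = [440.0, 880.0]
    print(track(octave_jump, bend_range=12.0))  # one note, bent a full octave
    print(track(octave_jump, bend_range=2.0))   # two notes, retriggered
```

With a 12-semitone range the octave jump is a single held note bent to the top of its range, which is where a sampled voice is stretched the most; with a 2-semitone range the same jump retriggers a second note.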

> Is it the same old package being trotted out, or a new programme
> altogether? Do you have a _working_ demo? Is it more than 2 seconds
> long?

At least we have a demo. Autoscore sells WITHOUT any demo.

It's the nature of the product that does not allow us to offer a longer
demo. We wish we had another choice. Sorry, we cannot go further
than a few seconds in a demo. If you are so suspicious about
the effectiveness of Digital Ear you can split a wave file into pieces
and see if it works. This is NOT for everyday use, it is only
there to prove that the software works. If you plan to use it
professionally, you buy the Full version. That's the idea.

If you have any other suggestions we are open to discuss.

By the way, our new demo has all the power tools available.

> Apart from the vituperation in which I was involved over this issue, I
> and others did make many constructive suggestions, which were ignored
> or refuted in the most part.

We always listen to people's comments, and most of the new features
of version 3.0 are based on user suggestions. We also now have a big
user base of people who use Digital Ear on an everyday basis in studios
and who give us very valuable feedback.

Epinoisis Software

May 25, 2001, 9:38:41 AM

"Dr.Matt" <fie...@login.itd.umich.edu> wrote in message
news:6HiP6.1182$3n.5...@news.itd.umich.edu...

> I think it would be funny if MIDI turned out to be a new data
> compression format. But on what synth?
>

We quote some feedback on this subject that we received from
a Digital Ear user:

"....I have always been fascinated with the idea of being able to deliver
full-length musical pieces that sound very much like they are performed by
humans on real instruments (because in many ways they are), yet have very,
very low transmission bandwidth requirements. In fact, I would like it if
someone were to come up with a streaming midi format, because right now I
believe that most products require the entire file to be transmitted before
playback begins. But MIDI is, of course, an almost ideal transmission format
for music, except for the obvious shortcomings w.r.t. instrument qualities
and effects processing: perhaps one day there will be a more agreed upon
standard for transmitting patch settings and effects settings as well.

Actually, my interest extends to a whole suite of standards that can be
applied to ultra-compressed transmission of media. You see, the mpeg (or
any other signal domain to spectrum domain) compression standards are
limited by information theory to a certain maximum compression ratio. Sure,
25:1 is pretty good. But it will never get much better than that because of
the inherent limitation in what you can and cannot throw out when using a
generalized approach of going from raw signal to spectrum and back again.
The best one can ever really hope for is about 50:1 from the wavelet version
of this approach, which is still not good enough for many applications and
requires a ridiculous amount of bandwidth for high definition signals. Yes,
even wavelet technology has this limitation and will only ever be about twice
as good as conventional DCT techniques (although this is nothing to laugh at
for a generalized approach). Most of the compression using signal to
spectrum encoding comes from:

1.) For each frame of the signal, analyzing the relative power in each
spectral range and simply discarding data about ranges whose power is
significantly lower than the average for that frame (i.e. will not
contribute significantly to the perceived nature of the signal anyway).
2.) analyzing entropy so that if the same range in several contiguous frames
changes very little (below some threshold), then that range is lossy run
length encoded for that set of frames.
3.) analyzing entropy so that interframe range deltas, rather than
individual range samples, are sent where LRLE cannot be applied effectively.
4.) simply discarding data about ranges that represent frequencies above a
certain level.
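
The four steps above can be sketched on toy data. This is a deliberately simplified illustration of the idea as the writer describes it, not how any real MPEG encoder works (real codecs use psychoacoustic models, quantization and entropy coding); the frame layout, function names and thresholds are all hypothetical.

```python
def prune_frame(bands, rel_threshold=0.1, max_band=None):
    """Steps 1 and 4: zero out weak bands and bands above a cutoff index.

    `bands` is a list of per-band power values for one frame.
    """
    mean = sum(bands) / len(bands)
    kept = []
    for i, power in enumerate(bands):
        too_high = max_band is not None and i >= max_band
        too_weak = power < rel_threshold * mean
        kept.append(0.0 if (too_high or too_weak) else power)
    return kept

def encode(frames, hold_threshold=0.05, **prune_args):
    """Steps 2 and 3: per band, run-length encode near-constant stretches
    as ('hold', n) and send ('delta', change) everywhere else."""
    frames = [prune_frame(f, **prune_args) for f in frames]
    streams = []
    for b in range(len(frames[0])):
        last = frames[0][b]  # value the decoder currently holds for this band
        stream = [("start", last)]
        for frame in frames[1:]:
            change = frame[b] - last
            if abs(change) <= hold_threshold:
                # Lossy: small changes are dropped and the run is extended.
                if stream[-1][0] == "hold":
                    stream[-1] = ("hold", stream[-1][1] + 1)
                else:
                    stream.append(("hold", 1))
            else:
                stream.append(("delta", change))
                last = frame[b]
        streams.append(stream)
    return streams

if __name__ == "__main__":
    frames = [
        [1.0, 1.0, 0.01, 1.0],
        [1.0, 2.0, 0.01, 1.0],
        [1.0, 2.0, 0.01, 1.0],
    ]
    for stream in encode(frames, rel_threshold=0.1, max_band=3):
        print(stream)
```

Twelve band samples shrink to nine short tokens here, and the savings grow with the number of frames in which a band stays put.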

However, this assumes that you require a completely generalized compression
algorithm. In other words, mpeg (or any DCT based encoding scheme) is the
most general way to approach highly compressing ANY signal and still getting
a pretty low perceived error between the original and reconstituted signal.
However, as well as a great many reasons to use signal to spectrum encoding
for compression and many reasons why signal to spectrum encoding is
generally the best approach, there are also great many limitations here.

Simply put, mpeg (and DCT in general) is the most common and usually the
best approach to lossy compression because it can be applied consistently
and without regard to source or meaning of the underlying signal to be
compressed. Also, the lossy aspect of this approach seems to equate to
similar perceived levels of loss in the quality of the resulting
reconstituted signal, regardless of what kind of data the reconstituted
signal is used for (still pictures, motion pictures, audio, etc).

However, taking such an approach assumes that we have no intrinsic
understanding of what the signal really represents. As such, we are limited
by the laws of general information theory as to the maximum compression
ratio that can be obtained without unacceptable loss of fidelity in the
reconstituted signal. MIDI is an excellent example of how much better we
can do if we discard the opaque nature of an original signal and say instead
that we are completely aware of the model that is generating it. That is,
MIDI can be thought of as one application of model based compression. I say
model based compression here to differentiate between it and signal based
compression, as used in mpeg and other DCT compression schemes.

MIDI demonstrates that if one takes a model based approach to compression,
ratios can easily be another 20-200 times better than the best results that
can possibly be achieved by generalized signal based lossy compression. In
other words, MIDI regularly achieves a 500-1000:1 or better compression
compared to transmitting raw 44.1 kHz 16 bit stereo audio signals. For
instance, it is not uncommon to be able to fit a moderately complex (lots of
slurs, dynamics, etc) 5 minute piece into a 100K MIDI. And this can then
usually be compressed at 2-3:1 via standard non-lossy techniques, whereas at
128 kbps MP3 fidelity the same piece requires 4800K bytes or between
48-144 times more data, depending on whether or not you want to factor in
the additional possibility of zip type compression. Taking into account that
in reality the receiving MIDI renderer may have a full 0-20K Hz fidelity and
may use as accurate a set of rendering algorithms as we like (in theory
anyway), it in fact would take 512 kbps MP3 fidelity to come close to
matching this, meaning that MIDI can actually be considered closer to
200(raw)-600(with zipping) times better at compression of music than a
generalized signal compression scheme could ever hope to be. Indeed, as the
quality of receiving MIDI rendering device increases, this effect increases
without any further demand on the MIDI format itself, so a rendering device
with 96 kHz bandwidth and 24 bit sample precision could be considered a way
of increasing the perceived compression ratio of MIDI by another factor of
3, resulting in ratios (with zipping included in the pipeline) of about
600-1800 times better than the generalized lossy signal compression
approach!
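
The figures in the paragraph above can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions that the text does not state outright: "K" is taken as 1000 bytes (which is what yields 4800K for five minutes at 128 kbps), and the zip gain is taken at the upper end of the quoted 2-3:1. The function name is hypothetical.

```python
SECONDS = 5 * 60          # the 5 minute piece from the text
MIDI_BYTES = 100_000      # "100K MIDI", assuming K = 1000 bytes

def ratio_vs_midi(kbps, zip_factor=1):
    """How many times more data a constant-bitrate stream needs than the MIDI
    file, optionally after the MIDI file has been zipped by `zip_factor`."""
    stream_bytes = kbps * 1000 // 8 * SECONDS
    return stream_bytes * zip_factor / MIDI_BYTES

# Raw CD-style audio: 44.1 kHz, 2 bytes per sample, 2 channels.
raw_bytes = 44_100 * 2 * 2 * SECONDS

if __name__ == "__main__":
    print("raw audio vs MIDI:", raw_bytes / MIDI_BYTES)
    print("128 kbps MP3 vs MIDI:", ratio_vs_midi(128), "to", ratio_vs_midi(128, 3))
    print("512 kbps MP3 vs MIDI:", ratio_vs_midi(512), "to", ratio_vs_midi(512, 3))
```

The 128 kbps case gives 48 and 144, matching the "48-144 times" in the text; the 512 kbps case gives 192 and 576, which the writer rounds to 200 and 600. Raw audio comes out at about 529:1, at the low end of the quoted 500-1000:1.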

Of course, where this all breaks down is when the real model for making the
music (or any other media) is a superset of the model we have decided to use
for encoding. For instance, if we decide to write a concerto for fog horns
and broken champagne glasses, it will be too much to ask of the poor MIDI
standard to faithfully reproduce this without an agreement to also send the
data that will allow the recipient's MIDI rendering tool to make these
sounds. Similarly, if we write a concerto for prepared piano, it will
probably be almost impossible to expect any MIDI rendering device to
faithfully reproduce this without sending a detailed sample of what each
piano string is supposed to sound like, in addition to the MIDI file. In
fact, the MOD format is a very early and primitive format designed to
overcome just such limitations. CSound, with the addition of the ability to
convert MIDI sources to native CSound score files, goes a long way toward
solving this problem as well, as long as composers and/or sound architects
are willing to invest a great deal of time and effort into creating
efficient non-sample-based orchestra definitions to reproduce the
instruments they wish the music to be performed on. Much effort has been
expended in this area, and synthesis based on the physical modeling of real
acoustic systems is finally starting to pay off somewhat.

But let's take this a step further. What about speech? What about video
conferencing? What about generalized audio? What about generalized motion
picture streams?

Well, let's first consider speech. If we use the most simple model based
compression for speech, we will simply send text and presume that the
receiver has a very intelligent renderer that can convert this to a
convincing stream of spoken word. In this case, we are competing with 96
Kbps mpeg which does a great job on single voice data streams. And let's
say that the average speaker has a rate of 120 words/minute (this is pretty
arbitrary...I really have no idea, but 2 words/sec seems about right) with
an average of 6 characters + 1 space per word. Then transmitting speech as
text achieves 737280/840 or 877 times better compression than equivalent
mpeg fidelity. Of course, this omits the nature of the speaker's voice and
the inflections they use...I am assuming some pretty sophisticated
software/hardware on the rendering end to fill in these gaps. But let's
assume that a SVML (Spoken Voice Markup Language) or a compact XML DTD is
developed which overcomes the limitations of missing speaker type and voice
inflections. Regardless, I am willing to bet that such a system could
easily perform 100 times better than any generalized lossy signal
compression technique. And given that there is really no limitation on the
quality produced at the rendering end and that text/markup streams often
approach 20-100:1 ratios when fed through zip compression engines, I would
expect that the ultimate perceived compression ratio would be far, far
better...approaching ~20,000-90,000 better for plain text than the best that
any generalized lossy signal compression technique could offer and closer to
the 2,000-10,000 times better range for a full phoneme markup language.
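
The speech estimate above works out as follows. The sketch assumes one byte per character and, to reproduce the writer's 737280 figure, 1 Kbps = 1024 bits per second (unlike the MP3 example earlier, which only works out with 1000); the function name is hypothetical.

```python
def speech_ratio(kbps=96, words_per_min=120, chars_per_word=7):
    """Bytes per minute of encoded audio divided by bytes per minute of plain
    text (6 characters + 1 space per word, one byte per character)."""
    audio_bytes_per_min = kbps * 1024 // 8 * 60       # 737280 for 96 Kbps
    text_bytes_per_min = words_per_min * chars_per_word  # 840
    return audio_bytes_per_min / text_bytes_per_min

if __name__ == "__main__":
    base = speech_ratio()
    print("text vs 96 Kbps audio:", base)
    # Zipping the text at the quoted 20-100:1 multiplies the advantage.
    print("with zip:", base * 20, "to", base * 100)
```

The base ratio is about 877.7, and the zipped range is roughly 17,500 to 87,800, which is where the "~20,000-90,000" in the text comes from.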

Similarly, if we limit ourselves to a small subspace of video signal
transmission--the video conferencing scenario--then model based compression
techniques can once again be used to great effect. Because most talking
heads are very similar and most movement in a video conference is limited
to the lips, eyes, and surrounding facial muscles, then each renderer can
store a basic model of the skull, jaw, and muscles that are common to each
speaker. All that needs to be transmitted are an initial set of deltas that
define each speaker's deviation from the basic model and cosmetic niceties
such as hair style/color, eye color, etc., and then a stream of data
describing the current translation of the speakers head from the norm and
the deviation in muscle tension from the relaxed or initial state for each
tracked muscle group. The speech data itself need be only what I described
in the preceding paragraph. Since the fidelity of rendering is not limited
by the transmission medium but rather is dependent upon the quality of the
rendering hardware and software on the receiving end, once again I would
expect perceived compression ratios to be around 100s to 1000s of times
better than the best that any generalized lossy signal compression technique
could offer. After all, once the initial model customizations are sent, one
need only transmit a few hundred 16 bit integer parameters per frame, or
somewhere on the order of a raw 12000-30000 bytes per second. Given that
standard compression techniques should yield anywhere from 2-100:1 ratios on
this data (it is very redundant after all), this results in a requirement of
anywhere between 15000 bytes/second to 120 bytes/second. And when you
consider that the data can be rendered to any resolution without graininess
(unlike the case of signal based compression), the perceived compression
ratio could be in effect astronomically greater than with any generalized
lossy signal compression technique. Indeed, using AI techniques, much
higher level (far more compressed) gestural data could be transmitted and
features for various speaker entities could be "remembered" between sessions
so that transmission of model customizations could be significantly reduced
and much higher level data could be sent than individual muscle tensions,
resulting in even lower bandwidth requirements.
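
The bandwidth range quoted for the talking-head model can be checked the same way. The text gives only "a few hundred 16 bit integer parameters per frame" and the 12000-30000 bytes per second result; the parameter counts (200 and 500) and the 30 frames per second used below are assumptions chosen to reproduce those numbers, and the function name is hypothetical.

```python
def param_stream_rate(params_per_frame, fps=30, bytes_per_param=2):
    """Raw bytes per second for a stream of 16-bit model parameters."""
    return params_per_frame * bytes_per_param * fps

if __name__ == "__main__":
    low, high = param_stream_rate(200), param_stream_rate(500)
    print("raw:", low, "to", high, "bytes/second")
    # Worst case: largest stream, weakest (2:1) compression.
    # Best case: smallest stream, strongest (100:1) compression.
    print("compressed:", high / 2, "down to", low / 100, "bytes/second")
```

This gives 15000 bytes per second at worst and 120 at best, the two endpoints quoted in the text.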

Now let us consider the case of generalized audio compression. Surely, all
the work that has gone into sound synthesis using physical modeling
techniques will not be wasted here. Just as an instrument or a speaker's
larynx, voice box, etc. can be simulated in this manner, so can any sound
source. And surely, the reverse holds true. That is, given a sound source,
find the best physical model for it and the parameters for the physical
model that describe how the sound changes with time. Of course, this is way
off in the outer realms of theoretical development and has its own set of
limitations (i.e. what about for sounds that we have never heard before?
Can we really develop an algorithm which can decide for any sound stream
what set of physical models and what set of parameters for each model is
the set best used to reproduce those sounds with the lowest required amount
of transmitted data?). But the argument is still a valid one in this case.
Sure, the problems may seem insurmountable at this time, but this method
can, in theory, be applied to achieve compression ratios that are once again
in the 100s to 1000s of times better than any generalized lossy signal
compression technique. And this is simply because we choose not to put the
data in a black box and consider it to be just a general signal with no
underlying meaning. As soon as we ascribe meaning to the data and conceive
of a model for how the data came to be, it is almost certain that we will do
much better than any generalized approach can ever hope to. This is
especially the case when we can depend on both the transmitter and the
receiver to have a rather large set of predefined models to choose from so
that model negotiation can proceed by simply specifying the index or name of
the models to be used rather than transmitting a full description of each.
Of course, this does not preclude the ability of the transmission protocol
to allow sending full model descriptions where that is required when a
receiver does not already have a model for the data a sender is trying to
transmit.

Finally, let us turn our attention to generalized motion picture streams.
Is it possible to apply a model based approach here? The problem again
seems insurmountable. There are just too many possibilities to
handle....too many models to efficiently represent every possible source of
visual data...or are there? I would suggest that already there has been
significant research and development in this area (see:
http://www.cs.ubc.ca/labs/imager/contributions/forsey/dragon/stor.html as an
example of state of the art 6 years ago! Things have progressed quite a bit
since then, but are kept under wraps more than they used to be). Certainly,
work in the area of H-spline representation of three dimensional data goes a
long way toward this. The essence of H-spline representation is that a
hierarchy of splines is used to represent surfaces. In this way, only the
most significant features of an object need to be transmitted and very
sophisticated transformations can be applied by simple adjustments to the
appropriate feature at the appropriate level in the hierarchy. Because
alterations in high level features "trickle" down through the hierarchy into
the rendering of lower level features, very natural organic transformations
can be effected with relatively little data (a small number of splines and
control point transform descriptions), as compared to traditional facet
(point, line, face, connection list) based techniques. And due to the
special mathematical nature of H-splines, a far greater degree of control
and locality can be achieved over previous non-linear modeling techniques
such as traditional B-splines or NURBS (Non-Uniform Rational
B-Splines). Although we are far from attaining any widely workable level of
generalized model based compression for motion picture or 3D data
transmission, there is a great deal of hope and promise in this area.
Certainly, in theory it is possible, and the compression ratios that can be
achieved are sure to be on the same order as those discussed for music,
speech, audio, and video conferencing...."
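
A minimal sketch of the hierarchy idea described above, in Python. It is not an H-spline implementation: Chaikin corner cutting stands in for true B-spline refinement, the shape is a 2D curve rather than a surface, and the names `refine` and `evaluate` are hypothetical. It only shows how a fine level stored as small local offsets follows along when a single coarse control point is moved.

```python
def refine(points):
    """One round of Chaikin corner cutting (a stand-in for B-spline refinement)."""
    out = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        out.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
        out.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
    return out


def evaluate(coarse, offsets):
    """Fine-level shape = refined coarse shape + locally stored offsets."""
    base = refine(coarse)
    return [(bx + dx, by + dy) for (bx, by), (dx, dy) in zip(base, offsets)]


# Level 0: four coarse control points on a straight line.
coarse = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

# Level 1: six fine points, stored only as offsets from the refined coarse shape.
offsets = [(0.0, 0.0)] * 6
offsets[2] = (0.0, 0.5)  # one small local detail (a "bump")

before = evaluate(coarse, offsets)

# Editing ONE coarse point moves every fine point that depends on it,
# and the local bump rides along without being re-transmitted.
coarse[1] = (1.0, 1.0)
after = evaluate(coarse, offsets)

for b, a in zip(before, after):
    print(b, "->", a)
```

Only the changed coarse point would need to be sent to a receiver that already holds the offsets, which is the data saving the text argues for.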

Best Regards,

Epinoisis Software

May 25, 2001, 2:33:52 PM
> I didn't say "no software vendor other than Epinoisis;" I said "no
> one." I'm willing to amend that to "no one so far (other than
> Epinoisis) with a real name and real e-mail address." Understand now?

This is completely YOUR judgement. Have a look at our
webpage at: http://www.digital-ear.com/reviews.htm

There you can see real people and real companies like YAMAHA supporting
Digital Ear. ALL the negative postings come from a couple of people
(including you) who do not even own the software.

We have HUNDREDS of satisfied customers. Do you want their opinion?
You can find most of them in our forum:

http://www.digital-ear.com/forum.htm

> Nonsense! Even instruments played legato have an attack when a new
> note is sounded. The junk files your software produces are a few
> notes and a bunch of long slides, with the notes often sustained far

Anyway, we cannot continue arguing about this. The demos are there
and people can judge for themselves. Period.

> note to the next even if played legato. Let's get serious here! If
> I've made any error in my assessment, it is in assuming that you've
> portrayed the capability of your software in the very best possible
> light using files that display all its capabilities.

The files are there to demonstrate the CONTRAST with similar
software which simply cannot do this.

> you did? So put up some files with instruments played STACCATO and
> without pitch bends.

Do you really believe that this would be a problem? Technically,
legato and staccato files are exactly the same to Digital Ear.


> the usual MIDI editing software. Your software produces six or seven
> real note events with notes sustained well beyond the duration a real

We cannot follow you here. Six or seven real note events?

> I may be wrong, but I think this is what all those folks who keep
> asking about WAV-to-MID conversion expect and THAT isn't what you
> deliver. If I want something that can parrot back some sound I can
> buy a little handheld device from Radio Shack that will record a
> minute or so of sound into a chip and play it back. But what I want

Does this device change the TIMBRE of your voice? Can you play back
your voice as a flute?

> of notes and a slide for life with the pitch wheel? The things you
> say above can be done on any decent sequencer don't work worth a damn
> on notes sustained for several measures while being modified by a
> zillion pitch wheel events. It's an absolute bitch to edit that mess
> using my favorite editing tools. Yes, the files Digital Ear produces
> are essentially uneditable!
>
> It just struck me! This is why you limit your demo version of the
> software to a two-second conversion, isn't it! Anything longer than
> two seconds and this shortcoming stands out like a sore thumb!

Read our previous post about this...

> It's scary alright! What it is is the "Dancing Pig" syndrome. They
> aren't looking closely at how well your pig is dancing; they're just
> swept away by the fact that it dances at all. What Digital Ear does
> *IS* close to amazing (or scary), but once you get over the initial
> reaction and start listening critically it breaks down quickly. And,
> as I said, it's uneditable for all practical purposes.

Maybe for YOUR purpose. By the way wh

> Also - please note that in the experience of many the magazines aren't
> always a good source of unbiased information. As an extreme example,
> do you remember the review of those little adhesive backed disks that
> were said to completely change the acoustical characteristics of a
> room when stuck on the walls? We're talking a couple dollars worth of
> adhesive backed disks being sold for over a hundred dollars with the
> claim they could tailor the acoustic characteristics of your room to
> improve the sound of your sound system. It was a total rip-off and a
> major stereo magazine gave the things an unqualified thumbs up!
> Several of the stereo magazines also gave thumbs up to connecting
> cables with gold plated connectors - saying they could hear a
> diference in subjective listening tests. Baloney!


>
> >Do you work for a competitor of Digital Ear :-) ?
>

> I thought you had no real competitors? ;-) But seriously - I've

(Notice the smiley after our phrase)

Right :-) No REAL competitors. But there are still companies
that try to sell their pitch-to-MIDI products, and they don't even
come close to Digital Ear.

> Do you really think a competitor would have updated the link to your
> website in the AMM FAQ this morning, at your request, within 5 minutes
> of receiving that request?

We are talking about the negative comments you ALWAYS make
within minutes of every single post of ours.

If you are so negative about Digital Ear, then how negative
should you be about other software that is much worse (and costs more)?

Why don't you post any negative comments about other programs?

> But as long as we're questioning credibility... if your software is
> so good, where are all the satisfied users? I see one or more queries
> nearly every day in alt.music.midi alone asking how to do WAV-to-MID
> conversion and I NEVER see anyone touting ANY software as a solution
> to this problem except the vendors themselves.

Not true. People keep recommending Digital Ear all the time.

> Have you extended your demo version to produce a result more than a
> few seconds long? If not, why not? You have been told by several
> many times that it is unusably short.
>Are you afraid to extend it to
> 20 seconds or so so evaluators can actually get a feel for what it
> does? Won't the software stand up to a good close look and a rigorous
> test? I answered all your questions, now you answer mine!

It is enough to see that it works, but NOT enough for everyday
studio use. Why? Because it is a demo.

20 seconds is too long. You can split a file into 2-3 pieces and
convert it for FREE. What's wrong with that? Read below:

If Digital Ear sales were to stop, we would stop
developing and improving the software. We give
completely free upgrades to our customers, and no other
company does that.

If you feel that Digital Ear is better than any other software then you
should support it so it will become even better in the future.


Regards,

Nick White

May 25, 2001, 7:27:02 PM

Alright, maybe I will give it a try.

Epinoisis Software

May 25, 2001, 9:21:30 PM
To summarize our positions:

1. Digital Ear is a new product (a year and a half old,
not years as you said) that converts monophonic
audio into a MIDI file. We are pretty confident that it
does so better than any other software available
today.

It provides a cost-effective solution for converting solos into
MIDI.

Note that hardware controllers like ZETA (violin-to-MIDI)
cost several thousand dollars to do exactly the same
task as Digital Ear, which costs $80. Just listen to
our excellent violin conversion.

That fact alone gives Digital Ear a perfect reason to exist
in the market.

2. To give an idea of the conversion quality, we have
authentic audio demos on our website at:
http://www.digital-ear.com/midi.htm

Most people love them (a very few, like you, don't...)

We are always open to users who want to evaluate a
particular .wav file before purchasing.

About our physical location:

Epinoisis Software is located in Athens, Greece.
Members of our team are in the U.S. and Europe as well.

As you know, our ISP and news server provider can be
anywhere in the world.

Anyone can use *any identity* to post whatever he/she wishes.

We are not responsible for postings of everyone in the
world, and we have no way to prove our identity on the
web (maybe we should start using electronic signatures
in the future.) So please do not use these as an argument
to diminish the value of our software.

Best Regards,

Nick White

May 25, 2001, 11:43:41 PM
Leave it Jim! As they have said, the demo is there for people to
judge. They will not alter its length. The discussion is really
getting nowhere.

In the end if someone buys that software on the basis of that demo
then good luck to them!

I think that your policy of pointing people to your FAQ, with cautions
as usual is probably more effective and far less stressful than this
head-butting.

Epinoisis Software

May 26, 2001, 9:18:41 AM
> I think you may be exaggerating here. I've not seen ZETA violin to
> MId, but I've seen MIDI guitar pickups for considerably under several

For your information, ZETA (violin-to-MIDI) costs $2,495 USD!
(Two thousand, four hundred and ninety-five dollars!)

Compare this price tag with Digital Ear's, which is $79.95!
We all know that violin music is mostly monophonic, except in the
rare cases where the violin plays two strings at once. So in most
cases you can achieve the same (or better) result for roughly one
thirtieth of the price.

Why is this? Because Digital Ear is SOFTWARE. There are no
manufacturing costs, no hardware to pay for. Besides, you
get FREE upgrades, and you can use it not only for violin
but for a plethora of other musical instruments as well.

> thousand dollars and they do a very good job of creating MIDI data
> that can be manipulated with sequencing software afterward. Why?
> Because they do what Digital Ear CANNOT, repeat CANNOT, do. They
> convert each string separately! They are effectively polyphonic!

Do you know any brand names? We do not know of anything under $1000.
What is your experience with those controllers?

Besides (READ THIS), most of those controllers (e.g. the Yamaha GR-50)
use EXACTLY THE SAME PITCH BEND MODEL AS DIGITAL EAR
that you criticized so much.

AXON (Guitar-to-MIDI controller) AX-100 (List price: $1000 USD)

Listen to their samples at:
http://www.musicindustries.com/axon/compare.htm

Do you think that it is any better than our samples at:
http://www.digital-ear.com/midi.htm

>
> >That fact alone, give a perfect reason Digital Ear to exist
> >in the market.
>

> No one is arguing that Digital Ear should not exist.


>
> >2. To have an idea of the conversion quality we have
> >authentic audio demos in our website at:
> >http://www.digital-ear.com/midi.htm

> do this to make the software look better. As I said, I found 6 or 7
> note events out of several thousand total events in one of your files.
> The rest were similar. I can't make a meaningful edit of that sort of
> mess. I am not impressed.

It's OK. It seems that you pay more attention to the internal format
of the MIDI file than to the AUDIO result itself. We have a different
perspective on that (and other companies do as well.)


> claims you make for the software. It does not perform as you suggest
> it should. This may be because it can't, and it may be because the

No, it does. And you can easily verify that if you split a .wav file into
several pieces. It's a pain, we agree, but it is a DEMO.

It is your right to dislike the demo, but you cannot really say that
the software does not work as advertised.
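
A minimal sketch of the splitting step, in Python with only the standard `wave` module. The function name `split_wav`, the output naming scheme and the chunk length are assumptions for illustration; the thread only says that the demo converts a couple of seconds at a time. It cuts at fixed time intervals, so a cut may land in the middle of a note.

```python
import wave


def split_wav(path, chunk_seconds, out_prefix):
    """Write consecutive chunks of at most chunk_seconds; return the new file names."""
    names = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(chunk_seconds * src.getframerate())
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:  # end of file reached
                break
            name = f"{out_prefix}_{index:02d}.wav"
            with wave.open(name, "wb") as dst:
                # Same channels, sample width and rate as the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            names.append(name)
            index += 1
    return names


# Hypothetical usage: split_wav("solo.wav", 2.0, "solo_part")
```

Each resulting piece can then be converted separately and the MIDI files joined again in a sequencer.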

Let us repeat once more that anyone can post from any part of
the world with any identity and name. That's the nature of USENET.

Universities have public computers in their labs from which anyone can
post under any name he/she wants. Some people do not even bother
to change the email address.

By the way, we have customers from Harvard University (they use it
for an innovative query-by-humming system), the University of Pennsylvania
and Stanford University as well.

We also get very useful feedback and comments from them.

If you really care to get feedback from Digital Ear users, you can
post in our forum (some of our customers are there) and
get many opinions from people who actually USE the software.

For our part, we will end the discussion here; we simply
do not feel like arguing any further about the proven value of
our software with one (1) person.

Epinoisis Software

May 26, 2001, 10:10:06 AM
> I think you may be exaggerating here. I've not seen ZETA violin to
> MId, but I've seen MIDI guitar pickups for considerably under several

For your information, ZETA (violin-to-MIDI) costs $2,495 USD!
(Two thousand, four hundred and ninety-five dollars!)

Compare this price tag with Digital Ear's, which is $79.95!
We all know that violin music is mostly monophonic, except in the
rare cases where the violin plays two strings at once. So in most
cases you can achieve the same (or better) result for roughly one
thirtieth of the price.

Why is this? Because Digital Ear is SOFTWARE. There are no
manufacturing costs, no hardware to pay for. Besides, you
get FREE upgrades, and you can use it not only for violin
but for a plethora of other musical instruments as well.

> thousand dollars and they do a very good job of creating MIDI data
> that can be manipulated with sequencing software afterward. Why?
> Because they do what Digital Ear CANNOT, repeat CANNOT, do. They
> convert each string separately! They are effectively polyphonic!

Do you know any brand names? We do not know of anything under $1000.
What is your experience with those controllers?

Besides (READ THIS), most of those controllers (e.g. the Yamaha GR-50)
use EXACTLY THE SAME PITCH BEND MODEL AS DIGITAL EAR
that you criticized so much.

AXON (Guitar-to-MIDI controller) AX-100 (List price: $1000 USD)

Do you think that it is any better than our samples at:
http://www.digital-ear.com/midi.htm

>


> >That fact alone, give a perfect reason Digital Ear to exist
> >in the market.
>

> No one is arguing that Digital Ear should not exist.
>

> >2. To have an idea of the conversion quality we have
> >authentic audio demos in our website at:
> >http://www.digital-ear.com/midi.htm

> do this to make the software look better. As I said, I found 6 or 7


> note events out of several thousand total events in one of your files.
> The rest were similar. I can't make a meaningful edit of that sort of
> mess. I am not impressed.

It's OK. It seems that you pay more attention to the internal format
of the MIDI file than to the AUDIO result itself. We have a different
perspective on that (and other companies do as well.)


> claims you make for the software. It does not perform as you suggest
> it should. This may be because it can't, and it may be because the

No, it does. And you can easily verify that if you split a .wav file into
several pieces. It's a pain, we agree, but it is a DEMO.

It is your right to dislike the demo, but you cannot really say that
the software does not work as advertised.

Let us repeat once more that anyone can post from any part of
the world with any identity and name. That's the nature of USENET.

Universities have public computers in their labs from which anyone can
post under any name he/she wants. Some people do not even bother
to change the email address.

By the way, we have customers from Harvard University (they use it
for an innovative query-by-humming system), the University of Pennsylvania
and Stanford University as well.

We also get very useful feedback and comments from them.

If you really care to get feedback from Digital Ear users, you can
post in our forum (some of our customers are there) and
get many opinions from people who actually USE the software.

For our part, we will end the discussion here; we simply
do not feel like arguing any further about the proven value of
our software with one (1) person.

Best Regards,

Justin Pearson

May 27, 2001, 11:00:54 AM
"Epinoisis Software" <in...@digital-ear.com> writes:

> Compare this price tag with Digital Ear's which is $79.95!
> We all know that violin music is mostly monophonic except in
> rare cases that the violin can play two strings at once. So in most

You don't listen to (or even play) much violin music do you?

>
> __________________________________________________________________________
>
> Epinoisis Software, the Digital Ear Team
>

good luck, maybe you'll make some money.


--
Justin Pearson - Uppsala Sweden http://www.docs.uu.se/~justin

Mark Brown M

May 27, 2001, 12:36:13 PM
> "Epinoisis Software" <in...@digital-ear.com> writes:
>
> > Compare this price tag with Digital Ear's which is $79.95!
> > We all know that violin music is mostly monophonic except in
> > rare cases that the violin can play two strings at once. So in most
>
> You don't listen to (or even play) much violin music do you?

I happen to be a classical violin and cello player myself.
I love technology, and I use Digital Ear as well.

My experiences with Digital Ear are excellent. I believe that
it pushes MIDI capability to its extreme.

Double stops occur when two adjacent strings are played simultaneously.
Fifths and sixths are relatively easy, as are consecutive or parallel
fifths. Seconds, thirds, fourths and augmented fourths are difficult
unless the upper note is an open string. Octaves and sevenths that have
an open string as the lower note are also easy. You can RARELY get a three-note
chord by combining a double stop with a note on the adjacent string.

In conclusion, you can very easily emulate parallel fifths by inserting just an
extra note-on event in the Digital Ear-generated MIDI file.
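
A minimal sketch of that edit in Python, assuming the note events have already been read out of the MIDI file into simple tuples. The tuple layout and the name `add_parallel_fifths` are hypothetical, not anything Digital Ear or Cakewalk exposes. It also duplicates the matching note-off so that the added note does not hang.

```python
FIFTH = 7  # a perfect fifth is 7 semitones


def add_parallel_fifths(events):
    """events: list of (tick, kind, note, velocity), with kind "on" or "off".

    Returns a new list in which every note is doubled a fifth above.
    """
    out = []
    for tick, kind, note, velocity in events:
        out.append((tick, kind, note, velocity))
        upper = note + FIFTH
        if upper <= 127:  # MIDI note numbers stop at 127
            out.append((tick, kind, upper, velocity))
    return out


# Open G string (MIDI note 55) doubled by the open D string (62).
solo = [(0, "on", 55, 90), (480, "off", 55, 0)]
for event in add_parallel_fifths(solo):
    print(event)
```

Because MIDI pitch bend applies to the whole channel, bend events already in the file move both notes by the same amount, so the interval stays a fifth through any slide or vibrato.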

Most of my work needs solos without chords. Whenever I need chords,
I just tweak the MIDI file in Cakewalk.

Without Digital Ear I could never have realistic violin tracks in
my compositions.

Just my 2c

------------------------------------------------------------------------
Mark Brown M

Nick Thomson

May 28, 2001, 10:29:06 AM
"Epinoisis Software" <in...@digital-ear.com> wrote in message news:<9ekcdc$n08$1...@netnews.upenn.edu>...

> > And no one other than Epinoisis who will stand up and say he can get
> > this sort of performance from the software.
>
> Yes, because only our software can do this. Why shouldn't we say that?
>

I totally agree. Digital Ear is the best audio-to-MIDI software
around. There is no other software or even hardware solution
that can accurately track a big variety of different voices and
instruments (converting a single instrument to MIDI is an easy task).

I agree that they need to make a longer demo so the capabilities of
the software could be demonstrated better (they would also avoid
all those guys bitching about this all the time).

What other software cannot do, however, is capture all the depth
of a live performance to MIDI. Digital Ear outputs three separate
streams of events (pitch, volume, brightness) and in many cases gives
outstanding results. That's the big strength of Digital Ear.
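
A minimal sketch, in Python, of how one analysis frame could be turned into three such event streams. Everything specific here is an assumption for illustration: the thread does not say which MIDI messages Digital Ear emits, so the sketch uses pitch bend for pitch, controller 11 (expression) for volume and controller 74 (commonly used for brightness), with a bend range of 2 semitones. The function names are hypothetical.

```python
import math

BEND_RANGE = 2.0    # semitones; assumed pitch-bend range of the receiving synth
BEND_CENTER = 8192  # 14-bit pitch bend value meaning "no bend"


def freq_to_midi(freq_hz):
    """Fractional MIDI note number (A4 = 440 Hz = note 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def bend_value(freq_hz, base_note):
    """14-bit pitch bend that moves base_note to freq_hz, clamped to the range."""
    semis = freq_to_midi(freq_hz) - base_note
    semis = max(-BEND_RANGE, min(BEND_RANGE, semis))
    value = BEND_CENTER + round(semis / BEND_RANGE * 8192)
    return max(0, min(16383, value))


def to_7bit(x):
    """Map a level in 0.0..1.0 to a MIDI data byte in 0..127."""
    return max(0, min(127, round(x * 127)))


def frame_to_messages(freq_hz, volume, brightness, base_note, channel=0):
    """Raw MIDI bytes for one frame: pitch bend, CC 11 and CC 74."""
    bend = bend_value(freq_hz, base_note)
    return [
        bytes([0xE0 | channel, bend & 0x7F, bend >> 7]),     # pitch stream
        bytes([0xB0 | channel, 11, to_7bit(volume)]),        # volume stream
        bytes([0xB0 | channel, 74, to_7bit(brightness)]),    # brightness stream
    ]


# One frame of a slightly sharp A4 at medium level.
for message in frame_to_messages(445.0, 0.6, 0.3, base_note=69):
    print(message.hex())
```

A held note with many such frames produces exactly the long run of bend and controller events, with few note-on events, that the earlier posts argue about.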

Ioannis

Jun 2, 2001, 6:25:21 PM
Epinoisis Software wrote:
[snip]

> Listen to their samples at:
> http://www.musicindustries.com/axon/compare.htm
>
> Do you think that its is any better than our samples at:
> http://www.digital-ear.com/midi.htm
[snip]

Before I start arguing in this thread, I want to compare for myself. Do
you by any chance have a page where a representative pair (WAV and MID)
are available for download? It would help if the .WAV file was neither
too small nor too large.

It would help a lot, as some of us use Macs and cannot use the demo.

Thanks
--
Ioannis Galidakis
<http://users.forthnet.gr/ath/jgal/>
_______________________________________________
"The traits of the good scientist: good command
of logic and _excellent_ command of insanity."

Ioannis

Jun 2, 2001, 7:52:26 PM
Epinoisis Software wrote:

[snip]

> Why this? Because Digital Ear is SOFTWARE. There are no
> manufacturing costs, no hardware to pay. Besides you
> get FREE upgrades, and you can using not only for violin
> but in a plethora of other musical instruments as well.

[snip]

> Listen to their samples at:
> http://www.musicindustries.com/axon/compare.htm
>
> Do you think that its is any better than our samples at:
> http://www.digital-ear.com/midi.htm

[snip]

I have downloaded most of the samples and compared all the pairs. While
not all sound convincing, it is a decent job and deserves credit. What
some of the guys who have argued against it here don't realize is that
whatever this program does, its true engine probably belongs to the
realm of AI, therefore one can _very_ easily argue against its true
efficiency.

Converting a soundform to MIDI is essentially impossible for software
alone. Again, let me emphasize the words "for software alone".

One easy way to prove that mathematically would be to take the
converted MIDI file and re-generate the wav from it, do a
simple Fourier transform on the re-generated result and compare the
spectrum of the re-generated result with that of the original wav.

It can be easily seen that there is no way to have not only identical
spectra, but even spectra that have very high similarity. In fact, any
software that manages to even _approximate_ the regenerated file's
spectrum to that of the original spectrum, would be pretty high level
tech.
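
A minimal sketch of such a comparison in Python, using only the standard library. The two "recordings" are synthetic stand-ins generated in the code, not real Digital Ear output, and cosine similarity of magnitude spectra is just one possible measure, chosen here for simplicity. The names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitudes of a naive DFT, bins 0..n/2 (fine for short test signals)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def spectral_similarity(a, b):
    """Cosine similarity of two magnitude spectra: 1.0 means the same shape."""
    sa, sb = magnitude_spectrum(a), magnitude_spectrum(b)
    dot = sum(x * y for x, y in zip(sa, sb))
    norm = math.sqrt(sum(x * x for x in sa)) * math.sqrt(sum(y * y for y in sb))
    return dot / norm if norm else 0.0


def tone(freq, harmonics=(1.0,), n=256, rate=8000):
    """A test tone; harmonics[i] is the amplitude of partial i + 1."""
    return [
        sum(amp * math.sin(2 * math.pi * freq * (h + 1) * t / rate)
            for h, amp in enumerate(harmonics))
        for t in range(n)
    ]


original = tone(500.0, harmonics=(1.0, 0.5))  # stands in for the recorded wav
resynth = tone(500.0, harmonics=(1.0,))       # same pitch, different timbre

print(spectral_similarity(original, original))
print(spectral_similarity(original, resynth))
```

Even with the pitch matched exactly, the second score falls below 1.0 because the partials differ, which is the point being made: a synthesizer voice playing the converted notes will not reproduce the original spectrum.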

As such, the arguments of these guys who actually claim that the
software "does not deliver what it promises" on the basis of
"truthfulness in reproduction" are actually moot, because this is
implied to a certain extent to begin with.

On the other hand, the authors of the software itself have to understand
that even though the idea is indeed revolutionary, they must abstain
from making extravagant claims, even to a small extent. Adjectives such
as "Amazing" or "incredible" depend on many wild factors and are, at
best, subjective in these cases.

If I was the author of the software, I'd even avoid the description "wav
to mid conversion" or "converter". There can be no such thing. What the
guys at Epinoisis have there, is an honest "attempt" to approximate
conversion and as such, they have to be VERY careful with how they word
their software claims.

Traditionally AI software can NEVER make extravagant claims, because of
its own nature. OMR and OCR for example, always leave a certain margin
of error with regard to effectiveness, which can range from very high
and unusable to low and decent, but NEVER perfect.

So, if I was the author of Digital Ear, I'd _definitely_ hire a
professional advertiser and pay him mucho bucks to spend 4-6 months
choosing the right wording for the software ads and claims.

Just my 2c of friendly advice.

> From our part, we will end the discussion here, we simply
> do not feel arguing any further about the proven value of
> our software with one (1) person.

A wise choice.

Keep up the good work. What you have done is impressive. And always
remember Murphy's Law as it's applied to cases of advanced tech:

The "first" reaction to a novel idea:
It's clearly impossible!
The "second" reaction to a novel idea:
It _could_ work if enough money and effort was put to it.
The "third" reaction to a novel idea:
I always said it was possible.

:*)

Regards,

> Best Regards,
> __________________________________________________________________________
>
> Epinoisis Software, the Digital Ear Team
>
> E-mail : in...@digital-ear.com
> Web: http://digital-ear.com/
> Fax: +1-(360)-237-6361
> Voice Mail: +1-(360)-237-6361
> __________________________________________________________________________

--
