Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Source maps and fuzzy search

6 views
Skip to first unread message

Eddy Bruel

unread,
Feb 19, 2015, 4:37:36 AM2/19/15
to Nick Fitzgerald, James Long, dev-developer-tools
Hi Nick, James,

Apologies for the huge wall of text. I wanted to provide some context for
what I am doing and why, as well as structure my own thoughts, before
describing the actual problem.

TLDR:

I recently changed the fuzzy search algorithm in the source map library to
be
biased to the right. This fixes an issue we had where setting a column
breakpoint on the first column of the an indented line would end up at the
previous line.

However, this change broke computing the original location of a breakpoint
for which we do not have an exact mapping. In this case, the fuzzy search
algorithm should probably be biased to the left.

Since this stuff is so complex, and because I messed it up the first time
around, I wanted to discuss it and see if the new approach makes sense to
you
before I proceed.

You’ll find a full explanation of what’s going on here below. Thanks for
your time!

WALL OF TEXT:

A source map defines one or more mappings from a position in an original
source to a position in the generated source. By assuming that the mappings
are contiguous (i.e. given two consecutive mappings, the first mapping ends
where the second one starts) we can think of these as mappings from a
position in an original source to a *range* of positions in the generated
source.

Consider the following (contrived) example for a fictional language, where
the call to g in the original source is expanded into calls to g1() and
g2()
in the generated source. In this case:

Original source:
f(); g(); h();
^
Setting a column breakpoint here

Generated source:
f(); g1(); g2(); h();
|---------|
Corresponds to setting a breakpoint here

This is not what we do at the moment. Currently we assume that each
position in the original source corresponds to a single position (rather
than a range of positions) in the generated source, so we only end up
setting a breakpoint
at the call to g1. This means that column breakpoints are not handled
properly in source mapped sources.

Let’s continue, using the same example. Suppose we set a line breakpoint in
the original source. In this case, no column is provided, so it defaults to
Infinity. We don’t have an exact match for this column, so instead the
source map library uses fuzzy search to find the closest mapping on the
given line.

However, since the fuzzy search algorithm always returns the mapping with
the *smallest* position that is greater than the one we are looking for, we
end up setting the breakpoint at the call to h. In other words, line
breakpoints are not handled properly in source mapped sources either.

To fix column breakpoints, we need to be able to set breakpoints on column
spans, rather than a single column. To fix line breakpoints, we need to be
able to find all mappings for a given line. I recently landed a few changes
to the source map library (computeColumnSpans and getAllGeneratedPositions)
that should allow us to implement this.

Things get more complicated when you add fuzzy search to the mix. Consider
the following example:

Original source:
f();
g();
^
Setting a column breakpoint here (on the indented line)

Generated source:
f(); g();
|--|
Corresponds to setting a breakpoint here

In this case, most source maps don’t define a mapping for the first column
of the second line, so we don’t have an exact match. However, since the
fuzzy search algorithm always returns the mapping with the *smallest*
position that is greater than the one we are looking for, we end up finding
the mapping for the call to f, not the one to g, so the breakpoint ends up
in the wrong place.

I recently landed a patch in the source map library that chances the fuzzy
search algorithm, so that it always finds the *greatest* position that is
less than the one we are looking for, in order to fix the above scenario.

However, as it turns out, things get even *more* complicated. Some of our
tests have generated sources that look like this:

g();
|-------|

Note that the mapping starts at column 1, not at column 6, where the call
to g is located. Now suppose we set a breakpoint at the call to g, and it
ends up getting hit. We need to report the original location of the
breakpoint to the client, so we try to find a mapping for column 6. We
don’t have an exact match for this column, so we apply the fuzzy search
algorithm. However, the fuzzy search algorithm tries to find the mapping
with the greatest position that is less than the one we are looking for,
which is *not* what we want here.

The core problem here is that there is nothing to guarantee that mappings
have to start at column 1 for generated lines, so we need to be able to
handle both the above case and the case where the mapping looks like this:

g();
|--|

What I *think* we should do here is the following. When mapping from
generated to original sources, we should think about mappings in terms of
column spans. In the above example, the call to g falls within the column
span of the mapping that starts at column 1, so the fuzzy search algorithm
should return the mapping with the smallest position that is greater than
the
one we are looking for. This should never cause us to end up on the previous
line, since breakpoints never hit on columns that contain whitespace.

Going the other way around, however, its perfectly possible for the user to
set a (column) breakpoint on a column that happens to contain whitespace.
Consider the following example:

f(); h();
^
Setting a column breakpoint here

Generated source:
f(); g1(); g2(); h();
|--|
Corresponds to setting a breakpoint here

I think what the user expects here is for the breakpoint to be set on the
call to h, not the call to f, so the fuzzy search algorithm should return
the mapping with the greatest position that is less than the one we are
looking for.
0 new messages