You can find Replit’s description of the problem here.
A live REPL with my solution and few test cases:
This is one of those programming problems that I’ve encountered at very different stages of my maturity as a programmer.
The first attempt I could not solve it in the allotted time I gave myself.
The most recent time, I savored the experience of solving it like a finger of rare scotch.
I usually find computing problems involving language to be fascinating so I want to document how I solved it.
I’ll be using JavaScript because I like it.
(for those too lazy to click the link)
You are given three strings. By strictly following the JSON-formatted instructions (in order) found in the third string, you should be able to transform the first string into the second string.
The keyword is should.
Sometimes, that won’t be possible (without breaking the rules in the JSON instruction string, anyway). Each instruction/rule in the JSON-formatted string is one of these three kinds:
Your goal is to write a function to determine, when given a set of three such strings, if it is or is not possible.
If you return true, it means following the rules in string three gets you from string one to string two.
If you return false, it means the string produced by following the rules is not the same as the second string.
First, a potential pitfall.
You may have an inclination that this can be done with regex — or as the kids are calling it, the ’gex.
Hey Mack, you down with the ’gex yet? — some kid in my neighborhood
It’s not wrong in itself to use regex to solve this problem.
What’s wrong is to think it’s possible to write a function that can translate any combo of the three kinds of instructions into a single master regex to solve the problem, even if you try to be clever and use a regex with a replacer function.
If it were only skips and deletes you had to worry about (no inserts) then definitely that would be possible.
But the presence of inserts makes this problem tricky enough that you have to be smarter than a regex replacer function.
I think that’s one reason they picked this problem. It has enough language recognition and manipulation aspects to it to make you not just reach for regex, but to do something interesting with it.
The first step is to convert the JSON string into an array of proper JavaScript objects.
function isValid(stale, latest, otjson) {
const ops = JSON.parse(
otjson
) }
Next, what we’re going to need is a regex for each instruction. But, to clarify, we actually need one tailored regex for each instruction defined at the time we need to run the instruction.
This means each of the instructions needs to become a function definition which has as parameters:
These function definitions will also each have closure over the variables holding the instruction data, like string to insert or number of characters to skip.
Using all that information, the functions will construct a regex pattern on the fly and apply it to the latest document appropriately.
It is not required to use regex to solve this problem. A few of you may have breathed a sigh of relief after reading that… Truth is there are pros and cons to opting for the ’gex.
Pros:
Cons:
But once you’ve solved a problem with regex, if you want to go back and change implementation to something faster it’s always possible. We won’t do that in this article.
This is the easiest one to do without regex since it doesn’t mutate the string at all. But let’s use it anyway just to be a little bit extra, also because the regex engine is probably better than I am at avoiding off-by-one errors.
Here we use the regex test
function since we don’t want
to change the input string. Really we just want to verify skipping ahead
by the designated number of characters is valid and doesn’t send us
careening off the edge of the cliff.
(,
doc
pos=> [
) ,
doc+ ot.count,
pos new RegExp(
`^[\\s\\S]{${
typeof pos === "number" ? (
pos: (
) => { throw "naughty" }
()
)()}}[\\s\\S]{${
typeof ot.count === "number" ? (
.count
ot: (
) => { throw "naughty" }
()
)()}}[\\s\\S]*$`
.test(
)
doc
) ]
Here the function takes the latest version of doc (as given by last instruction) and latest cursor position (an integer). It returns an array with values which will be required by next instruction or the final comparison.
The first two array entries are obvious, the unchanged document and the new position. Don’t worry that the new position can potentially be invalid; we will actually use the return value of the regex test to determine if we can continue to next instruction.
The regex is made by passing a pattern string to the constructor, where the pattern string is a template string.
The components of the pattern explained in sequence:
^
means the pattern must start matching from the start
of the document[\\s\\S]
is a character class; its actual syntax is
[\s\S]
but we must escape the backslashes in this kind of
string. A character class is a set of characters or character types
which is used to test a character in a string; if the character tested
matches one of the characters or character types, the pattern may
continue matching. This is a conventional character class in regex
world, including two types of characters: the set of all whitespace
characters, and the set of all non-whitespace characters. It’s a handy
way of saying, this class should match any possible character.{${typeof pos === "number" ? ( pos ) : ( () => { throw "naughty" } )()}}
- this part qualifies the character class preceding it (last bullet
point) and indicates the exact number of characters to match with the
character class. If you want to match 5, it should look like
{5}
. So the inner part of this section, which is everything
encapsulated by ${}
is a javascript expression which
results in a number (if not it throws). That javascript expression is a
ternary; the “else” portion of ternary is an immediately invoked
function expression (IIFE) since we can’t put throw
calls
at top level in ternaries.[\\s\\S]{${ typeof ot.count === "number" ? ( ot.count ) : ( () => { throw "naughty" } )() }}[\\s\\S]*
- we have a character class we saw before which matches any character a
number of times specified by ot.count
variable so long as
it is in fact a number. After that, we have another instance of the same
character class succeeded by an asterisk *
; this means 0 or
more of any character.$
means the pattern must match all the way to the end
of the document
(,
doc
pos=> {
) const rs = new RegExp(
`^([\\s\\S]{${
typeof pos === "number" ? (
pos: (
) => { throw "naughty" }
()
)()}})[\\s\\S]{${
typeof ot.count === "number" ? (
.count
ot: (
) => { throw "naughty" }
()
)()}}([\\s\\S]*)$`
.exec(
)
doc
)return rs instanceof Array && rs.length > 1 ? [
.slice(
rs1
.join(
)""
,
),
postrue
: [
] ,
rs,
posfalse
] }
Here the general strategy is to recognize the part of the string before what gets deleted (if any), the part of the string that comes after what gets deleted (if any), and then mash those two parts together.
The regex exec
function is used since we want to capture
or extract some of the string (as opposed to just checking if the string
matches the pattern).
The pattern itself is very similar to the skip pattern. However, note that the first character and its suffix, which indicates the number of characters to match, are surrounded by parenthesis. This is the capturing aspect of the pattern. Same is true for the final character class and its trailing asterisk.
At long last, our boss battle — the string regex replace function.
(,
doc
pos=> [
) .replace(
docnew RegExp(
`^(?<prefix>[\\s\\S]{${
typeof pos === "number" ? (
pos: (
) => { throw "naughty" }
()
)()}})(?<suffix>[\\s\\S]*)$`
,
)
(,
match...args
=> {
) const {
,
prefix
suffix= args[
} .length - 1
args
]const rs = []
if(
prefix.push(
) rs
prefix
).push(
rs.chars
ot
)if(
suffix.push(
) rs
suffix
)return rs.join(
""
)
},
)+ ot.chars.length,
pos true
]
This function, somewhat unsatisfyingly, is not a method of the regex object in javascript, but the string. No matter.
We pass as the first argument a newly created regex. The pattern uses
capturing groups as the delete command does, but these are named. The
syntax is only a bit different; the prefix of the contents inside the
parentheses is like ?<NAME>
where NAME
becomes a property name you can use in the callback (second argument) to
access any text captured by the pattern in the parentheses.
Our capturing groups are the prefix
of the string
leading up to the point of insertion and the suffix
of the
string, or remainder. So we are just splitting the string at the point
of insertion.
Then, we take the prefix, characters to be inserted, and suffix and join them, ensuring to account for cases where prefix and suffix don’t capture anything.
Here’s the final implementation, including the wrapping ternary statement, final check to determine if we produced correct string or not, and the test cases provided by replit:
const start = Date.now()
console.log(
`start → ${start}`
)function isValid(stale, latest, otjson) {
// this is the part you will write!
const ops = JSON.parse(
otjson.map(
)=> ot.op === "skip" ? (
ot
(,
doc
pos=> [
) ,
doc+ ot.count,
pos new RegExp(
`^[\\s\\S]{${
typeof pos === "number" ? (
pos: (
) => { throw "naughty" }
()
)()}}[\\s\\S]{${
typeof ot.count === "number" ? (
.count
ot: (
) => { throw "naughty" }
()
)()}}[\\s\\S]*$`
.test(
)
doc
)
]: (
) .op === "delete" ? (
ot
(,
doc
pos=> {
) const rs = new RegExp(
`^([\\s\\S]{${
typeof pos === "number" ? (
pos: (
) => { throw "naughty" }
()
)()}})[\\s\\S]{${
typeof ot.count === "number" ? (
.count
ot: (
) => { throw "naughty" }
()
)()}}([\\s\\S]*)$`
.exec(
)
doc
)return rs instanceof Array && rs.length > 1 ? [
.slice(
rs1
.join(
)""
,
),
postrue
: [
] ,
rs,
posfalse
]
}: (
)
(,
doc
pos=> [
) .replace(
docnew RegExp(
`^(?<prefix>[\\s\\S]{${
typeof pos === "number" ? (
pos: (
) => { throw "naughty" }
()
)()}})(?<suffix>[\\s\\S]*)$`
,
)
(,
match...args
=> {
) const {
,
prefix
suffix= args[
} .length - 1
args
]const rs = []
if(
prefix.push(
) rs
prefix
).push(
rs.chars
ot
)if(
suffix.push(
) rs
suffix
)return rs.join(
""
)
},
)+ ot.chars.length,
pos true
]
)
)
)try {
const result = ops.reduce(
(,
acc,
cur
idx=> {
) if(!acc[2]) throw "fail"
else return cur(
0],
acc[1]
acc[
),
}
[,
stale0,
true
]
)if(!result[2]) return false
else return result[0] === latest
catch(e){
} console.log(e)
return false
}
}console.log(isValid(
'Repl.it uses operational transformations to keep everyone in a multiplayer repl in sync.',
'Repl.it uses operational transformations.',
'[{"op": "skip", "count": 40}, {"op": "delete", "count": 47}]'
=== true); // true
)
console.log(isValid(
'Repl.it uses operational transformations to keep everyone in a multiplayer repl in sync.',
'Repl.it uses operational transformations.',
'[{"op": "skip", "count": 45}, {"op": "delete", "count": 47}]'
=== false); // false, delete past end
)
console.log(isValid(
'Repl.it uses operational transformations to keep everyone in a multiplayer repl in sync.',
'Repl.it uses operational transformations.',
'[{"op": "skip", "count": 40}, {"op": "delete", "count": 47}, {"op": "skip", "count": 2}]'
=== false); // false, skip past end
) console.log(isValid(
'Repl.it uses operational transformations to keep everyone in a multiplayer repl in sync.',
'We use operational transformations to keep everyone in a multiplayer repl in sync.',
'[{"op": "delete", "count": 7}, {"op": "insert", "chars": "We"}, {"op": "skip", "count": 4}, {"op": "delete", "count": 1}]'
=== true); // true
) console.log(isValid(
'Repl.it uses operational transformations to keep everyone in a multiplayer repl in sync.',
'We can use operational transformations to keep everyone in a multiplayer repl in sync.',
'[{"op": "delete", "count": 7}, {"op": "insert", "chars": "We"}, {"op": "skip", "count": 4}, {"op": "delete", "count": 1}]'
=== false); // false
)
console.log(isValid(
'Repl.it uses operational transformations to keep everyone in a multiplayer repl in sync.',
'Repl.it uses operational transformations to keep everyone in a multiplayer repl in sync.',
'[]'
=== true); // true
)
const end = Date.now()
console.log(
`end → ${end}`
)console.log(
`diff → ${end - start}`
)
Maybe next time we can ask ghostwriter to generate more test cases :)