Multiple faces of Scala Iterator trap

This post is about an issue with Scala collections that I encountered recently. Although it is purely a standard library issue, I hadn't faced it in almost three years of Scala development.

The consequences of not being familiar with the matter can lead to very painful results: unexpected stack overflow exceptions in your application.

The core

This issue can arise in a variety of different forms, so let's first look at the core: the simplest reproducer. It involves folding over a standard Iterator:

 
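Something along these lines (a minimal sketch — the exact numbers don't matter, only the shape does):

    // foldLeft here only *builds* an Iterator: it stacks one map
    // transformation per element on top of the initial Iterator(0)
    val iterator: Iterator[Int] =
      (1 to 100000).foldLeft(Iterator(0)) { (it, _) =>
        it.map(_ + 1)
      }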

As any Scala developer knows from day one, foldLeft is stack safe. But don’t let the feeling of safety trick you here. This code blows up the stack:

 
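With the iterator from the sketch above, a single traversal is enough:

    iterator.foldLeft(0)(_ + _) // java.lang.StackOverflowError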

And, of course, foldLeft itself is not the culprit. The biggest "WTF?" moment here is that the stack overflow is caused by this trivial-looking call:

 
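In terms of our sketch:

    iterator.hasNext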

How is that?

It turns out that for Iterator, most transformation methods, like map, filter, etc., do not immediately evaluate the resulting collection. Instead, they build another Iterator on top of the original one.

Calling these transformations several times in a row builds a chain of Iterators. Each one delegates in some way to the respective underlying iterator. Let’s look for the evidence in the source:

 
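Here's the relevant part of Iterator.map, lightly abridged from the Scala 2.12 standard library:

    // scala.collection.Iterator (Scala 2.12), abridged
    def map[B](f: A => B): Iterator[B] = new AbstractIterator[B] {
      def hasNext = self.hasNext   // just delegates to the wrapped iterator
      def next() = f(self.next())
    }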

As you can see, hasNext of the new iterator just calls hasNext of the underlying one. At this point the vulnerability should be pretty clear: it's possible to build a chain of arbitrary depth that will eventually implode.

What’s important to note, that it’s not the transformation itself, that blows the stack. To make the iterator crash you have to call something that triggers the chain. It can be something as simple as .toString or .isEmpty. As we will see later, it can make debugging harder.

The faces

In hindsight, this issue looks straightforward and even obvious. However, it can take so many forms and shapes that it can be a total riddle at first glance. The debugging experience can be very painful, especially when you're not familiar with the issue before it comes to the surface. I had that painful experience, so let's dive into the details.

I guess this kind of behaviour is fine for Iterator by its very nature. Also, one can fairly say that Iterator is not a frequent citizen of our code bases. I would agree — I can't recall using it directly. But things are not so simple.

There are places where an underlying Iterator can leak out of well-known standard collections. I personally burnt my fingers with the mapValues method on the standard Map.

Let’s now take it to the next level. The foldLeft reproducer we’ve seen is very focused. Everything is in one place and, thus, quite simple to debug. But with advent of Reactive Programming, various distribution abstractions like actors and streams are used more and more often.

What it means, is that you may have the very same Iterator trap, scattered over multiple stages of your stream. Or some actor can silently build up an iterator chain and then message it to some other actor, which triggers evaluation and blows up the stack.

I my case, an akka-persistence-query stream crashed after several thousands of replayed events. It’s quite mind boggling, when you see a stack overflow error inside the stream stage, where there’re no signs of infinite recursion or complex calculations.

To illustrate, here’s a simplified version of my case. We build an event-sourcing application with CQRS. When streaming events to the query-side, we want to ensure, that each event is processed exactly once by each consumer.

For that purpose, there’s a small Deduplicator stage, which accumulates a map of last processed event numbers. This allows the stage to mark each event as an original or a duplicate separately for each consumer:

 
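The akka-streams plumbing doesn't matter here, so here's just the essential per-element logic as a rough sketch (names and shapes are illustrative, not the original code):

    final case class Envelope(eventNr: Long, event: Any)

    // state maps consumerId -> last processed event number
    def onEvent(state: Map[String, Long],
                env: Envelope): (Map[String, Long], Map[String, Boolean]) = {
      // for every consumer: the advanced counter and an original/duplicate mark
      val deduplicated: Map[String, (Long, Boolean)] =
        state.mapValues(last => (math.max(last, env.eventNr), env.eventNr > last))
      // this becomes the stage state for the next element — layers accumulate!
      val newState = deduplicated.mapValues(_._1)
      // and this is pushed downstream together with the event
      val marks = deduplicated.mapValues(_._2)
      (newState, marks)
    }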

Although not trivial, the code is reasonably simple. And it is a time bomb.

Notice that the deduplicated value is constructed using mapValues. Then, after another mapValues, it becomes the stage state for the next stream element.

So this is effectively the same fold, just an asynchronous one. After some number of events the Iterator chain becomes longer than the stack can fit. The bomb is ready.

The last ingredient for the explosion is the trigger. And our innocent Deduplicator just passes the trigger down the stream. The actual crash happens somewhere in the consumer code that happens to access the Map.

The fix

One can fairly argue that Deduplicator is not optimally implemented and could be more efficient. But for the sake of our study, let's fix it without rewriting.

The solution to an overly long Iterator chain is to break it. The first idea that comes to mind is to trigger evaluation of the mapValues result immediately.

There are many ways to do that. For example, there are several good suggestions in this SO thread. Developers from Lightbend argued that a .view.force call is the best way to "evaluate" the collection.

Indeed, in our example, calling deduplicated.view.force produces a fresh map without any ticking time bombs inside. The stack overflow threat is gone.
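In terms of our sketch:

    // force both derived maps into fresh, strict Maps
    val newState = deduplicated.mapValues(_._1).view.force
    val marks    = deduplicated.mapValues(_._2).view.force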

Conclusion

So this is it. Any time you see a stack overflow with an Iterator involved, you now know what to look for.

While debugging this issue, I was lucky to have access to a paid Lightbend subscription. The support was incredible — relying on their experience and guidance while looking for the root cause and eliminating it saved me tons of time.

That’s all I have today, thanks for reading!

Scala.js In A Big Web Application Talk

My first talk ever at a big IT conference was about my production experience with Scala.js. It nicely concluded a full year of extensive Scala.js frontend development that my team had been doing at Evolution Gaming since May 2016.

I tried to lay out the various pros and cons of using Scala.js in a big browser SPA, based on that real experience. I also tried to go into details and provide examples for most of the points.

The talk took place on 16.05.2017 at the Riga Dev Days conference in Riga, Latvia.

Slides (english, Speakerdeck)
Video (48m, english):

Cross-platform polymorphic date/time values in Scala with type classes

At Evolution Gaming, Artsiom and I work on an internal scheduling application that has a huge ScalaJS frontend. We have to deal with lots of complex date/time logic, both on the backend and in the browser.

I quickly realised that sharing the same business logic code across platforms would be a massive advantage. But there was a problem: there was (and still is) no truly cross-platform date/time library for Scala/ScalaJS.

After some research I settled on a type class-based solution that provides cross-platform java.time.*-like values with full TimeZone/DST support. In this post we will:

  • take a quick look at the current state of date/time support in Scala/ScalaJS;
  • see how to get cross-platform date/time values today with the help of type classes.

The described approach works quite well in our application, so I extracted the core idea into a library called DTC. If you're a "Gimme the code!" kind of person, I welcome you to check out the repo.

Prerequisites

I assume the reader is familiar with ScalaJS and knows how to set up a cross-platform project. Familiarity with type classes is also required.

The Goal

There’s no solution without a goal. Precise goal will also provide correct context for reasonings in this article. So let me state it.

My primary goal is to be able to write cross-platform code that operates on date/time values with full time zone support.

This also means that I will need implementation(s) that behave consistently across JVM and browser. We’re Scala programmers, so let’s choose JVM behaviour semantics as our second goal.

Current state of date/time in Scala and ScalaJS

So we have our goal, let’s see how we can achieve it.

In this section we’ll go over major date/time libraries and split them into 3 groups: JVM-only, JS-only and cross-platform.

JVM-only

We won’t spend too much time here, everything is quite good on JVM side: we have Joda and Java 8 time package. Both are established tools with rich API and time zone support.

But they can’t be used in ScalaJS code.

JS-only

We’re looking at JS libraries, because we can use them in ScalaJS through facades. When it comes to date/time calculations, there are effectively two options for JavaScript: plain JS Date and MomentJS library.

There’re things that are common for both and make them quite problematic to use for our goal:

  • values are mutable;
  • semantics are different from JVM in many places. For example, you can call date.setMonth(15) , and it will give you a same date in March next year!

There’s also a JS-Joda project, which is not so popular in JS world, but has much more value to JVM developers, because it brings Java8 time semantics to Javascript.

JS Date

JS Date is defined by ECMA Script standard and is available out of the box in any JS runtime. But it has several weaknesses. Major ones are:

  • quite poor API;
  • time zone support is not universal: behaviour depends on environment time zone, and you can’t ask for a datetime value in an arbitrary zone.

Since JS Date is a part of language standard, ScalaJS bindings for it are provided by ScalaJS standard library.

MomentJS

Although MomentJS values are mutable and calculations still have minor bugs, it's quite a good library, especially if you need full time zone support.

It also has a ScalaJS facade.

JS-joda

JS-joda is an implementation of a nice idea: porting java.time.* functionality to Javascript. Though I've never used this project, it looks like an established and well-maintained library.

ScalaJS facade is also in place, so you can definitely give it a try in your Scala project.

The only problem with regard to our goal is that it still lacks proper DST support. But that work is already in progress, so you can expect it to be fully functional in the foreseeable future.

Cross-platform date/time libraries

After a small research, I found three options. Let’s see them in detail.

Scala-js-java-time

This library is the future of cross-platform date/time code. It’s effectively Java 8 time, written from scratch for ScalaJS.

At the time of writing this post, scala-js-java-time already provides LocalTime, LocalDate, Duration, and a handful of other really useful java.time.* classes (full list here).

This means you can use these classes in cross-compiled code without getting linking errors: in the JVM runtime the original java.time.* classes are used, while the JS runtime is backed by the scala-js-java-time implementations.

The problem is that we need LocalDateTime and ZonedDateTime in ScalaJS, and they are not there yet.

Spoiler: we’ll be using scala-js-java-time in our final solution for the problem.

Scala Java-Time

Scala Java-Time is a fork of the ThreeTen backport project, so its main purpose is to provide java.time.*-like functionality on Java 6 & 7.

It is also compiled to ScalaJS, which means we can write cross-platform code with it. And we can even use (to some extent) LocalDateTime!

The only problem is it doesn’t support time zones for ScalaJS yet (providing this support is the main focus of the project now, though).

So this library is close, but still misses our goal by a tiny bit.

Soda time

Soda time is a port of Joda to ScalaJS.

It’s in early development stages and also doesn’t have time zones in ScalaJS, but I still added it to the list, because developers took an interesting approach: they are converting original Joda code with ScalaGen.

So the resulting code really smells Java, but I’m still curious of what this can develop into.

Idea? No, the only option

The reason I’ve given the overview of currently available libraries is simple: it makes clear that there’s only one possible solution to the problem.

There’s no cross-platform library with full time zone support. And for JavaScript runtime there’s only MomentJS, that really fits our requirements. All this leaves us with nothing, except following approach:

  1. We define a type class that provides a rich date/time API. It's the glue that will allow us to write cross-platform code.
  2. All code that needs to operate on date/time values becomes polymorphic (see the sketch right after this list).
  3. We provide platform-dependent type class instances: java.time.*-based for the JVM and MomentJS-based for ScalaJS.
  4. We define common behaviour laws to test the instances against. This guarantees that behaviour is consistent across platforms.
  5. The MomentJS API is powerful, but it has to be sandboxed and shaped to:
    • provide immutable values;
    • provide JVM-like behaviour.
  6. There's one limitation we can't overcome without some manual implementation: both JS libraries lack nanosecond support, so we'll have to live with millisecond precision.

We won’t go over all of these points in this article. DTC library does the heavy lifting for all of them. In following sections we’ll just glance over the core implementation and example.

DateTime type class

Let’s just take a small subset of java.time.LocalDateTime API and lift it into a generic type class. We’ll use simulacrum, to avoid common boilerplate:

 
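A reduced sketch of the idea (the real type class in DTC has a much richer API):

    import java.time.{Duration, LocalDate, LocalTime}
    import cats.kernel.Order
    import simulacrum.typeclass

    @typeclass trait DateTime[A] extends Order[A] {
      def date(a: A): LocalDate      // the date part of the value
      def time(a: A): LocalTime      // the time part of the value
      def plus(a: A, d: Duration): A // millisecond-precision addition
    }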

First of all, a total order for DateTime values is defined: we extend cats.kernel.Order and get all its goodies out of the box.

Second, thanks to scala-js-java-time, we can use LocalTime and LocalDate to represent parts of the value. We can also use Duration for addition operations.

For now, let's view it simply as an abstraction over LocalDateTime. We'll get to time zone support a bit later.

Cross-compiled business logic

Having our new type class, let's define some "complex" logic that needs to be shared across the JVM and the browser.

 
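For instance, something like this (a sketch; the import path for the generated syntax is an assumption):

    import java.time.Duration
    import DateTime.ops._ // simulacrum-generated syntax

    // end of a meeting that starts at `start` and takes `slots` half-hour slots;
    // runs unchanged on the JVM and in the browser
    def meetingEnd[T: DateTime](start: T, slots: Int): T =
      start.plus(Duration.ofMinutes(30L * slots))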

With syntax extensions in place, the code looks quite nice.

Moreover, notice that nothing here says whether the time should be local or zoned. The code is polymorphic, and we can use different kinds of date/time values depending on the context.

Now let’s get to the flesh and bones: type class instances.

Type class instances

Let’s start with JVM instance, as it’s going to be quite simple. Follow comments in code for details.

 
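Continuing our reduced sketch:

    import java.time.{Duration, LocalDate, LocalDateTime, LocalTime}

    // on the JVM we simply delegate to java.time.LocalDateTime
    implicit val jvmLocalDateTime: DateTime[LocalDateTime] =
      new DateTime[LocalDateTime] {
        // the total order comes straight from the underlying compareTo
        def compare(x: LocalDateTime, y: LocalDateTime): Int = x.compareTo(y)
        def date(a: LocalDateTime): LocalDate = a.toLocalDate
        def time(a: LocalDateTime): LocalTime = a.toLocalTime
        def plus(a: LocalDateTime, d: Duration): LocalDateTime = a.plus(d)
      }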

With MomentJS it’s going to be much more interesting, because we’ve obliged ourselves to provide values, that are comfortable to work with for a functional programmer.

To enforce immutability, we won’t expose any moment APIs directly. Instead, we’re going to wrap moment values in a simple object, that will be immutable:

 
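Roughly like this (the MomentJS facade calls here are assumptions on my side — see the repo for the real code):

    import java.time.{LocalDate, LocalTime}
    import moment.{Date, Moment} // facade names assumed

    class MomentLocalDateTime private (private val underlying: Date) {
      // (1) both the constructor and the raw value are private

      def toLocalDate: LocalDate =
        // (2) moment months are 0-based, java.time months are 1-based
        LocalDate.of(underlying.year().toInt, underlying.month().toInt + 1,
                     underlying.date().toInt)

      def toLocalTime: LocalTime =
        LocalTime.of(underlying.hour().toInt, underlying.minute().toInt,
                     underlying.second().toInt)

      def plusMillis(n: Double): MomentLocalDateTime =
        // moment's add() mutates in place, so we build a fresh value instead
        new MomentLocalDateTime(Moment(underlying.valueOf() + n))

      // (3) raw timestamps are enough to compare two moment values
      def timestamp: Double = underlying.valueOf()
    }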

Several notable things here:

  1. We make both constructor and underlying value private to make sure there’s no way to modify object internals. We’ll provide a custom constructor later.
  2. Notice the month value adjustment to provide JVM-like behaviour. You will see much more of this kind of thing in DTC — I even had to write a couple of methods from scratch.
  3. To compare two moment values, we use their raw timestamps.

Now it’s trivial to define DateTime instance for our MomentLocalDateTime:

 
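Continuing the sketch (imports as before):

    implicit val momentLocalDateTime: DateTime[MomentLocalDateTime] =
      new DateTime[MomentLocalDateTime] {
        def compare(x: MomentLocalDateTime, y: MomentLocalDateTime): Int =
          x.timestamp.compareTo(y.timestamp)
        def date(a: MomentLocalDateTime): LocalDate = a.toLocalDate
        def time(a: MomentLocalDateTime): LocalTime = a.toLocalTime
        def plus(a: MomentLocalDateTime, d: Duration): MomentLocalDateTime =
          a.plusMillis(d.toMillis.toDouble)
      }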

Now we have everything to run our generic domain logic on both platforms. I'll leave that as an exercise for my dear reader.

Now let’s discuss some aspects of making this thing work for zoned values as well.

Time Zone support

It doesn't take much time to realise that we need separate type classes for local and zoned values. The reasons are:

  • They obey different laws. For example, you can't expect a zoned value to have the same local time after adding 24 hours to it.
  • They have different constructor APIs. A zoned value needs a time zone parameter to be properly constructed.
  • Zoned values should provide additional APIs for zone conversions.

On the other hand, most of the querying, addition and modification APIs are common to both kinds of date/time values. And we'd like to take advantage of that in cases where we don't really care about the kind of the value and wish to allow both.

This leads us to the following simple hierarchy:

  1. LawlessDateTimeTC (which we initially called DateTime) that contains common methods, specific to all date/time values.
  2. LocalDateTimeTC and ZonedDateTimeTC will extend LawlessDateTimeTC and provide kind-specific methods (constructors, for example).

This -TC suffix is ugly, but name clash in JVM code is worse :).

We will also have to provide a cross-compiled wrapper for time zone identifiers, because java.time.ZoneId is not yet provided by scala-js-java-time, and we don’t really want to pass raw strings around.

Everything else is just an evolution of core idea. Full implementation and more examples are available in the DTC repo.

Note on polymorphism

A side effect of this solution is that all your code becomes polymorphic over the specific date/time type. While most of the time you're going to use a single kind of time (zoned or local), there are cases when the polymorphism becomes useful.

For example, in an event-sourced system you may require zoned values for most of the calculations within the domain, as well as for commands. But at the same time, it can be a good idea to store events to the journal with UTC values instead.
With the type class-based approach, you can use the same data structures for both purposes by just converting between type parameters of different kinds.

Conclusion

Though polymorphic code can look scary to some people, the described approach gives us the following advantages:

  1. Truly cross-platform code, that operates on rich date/time APIs with time zone support.
  2. Polymorphism over specific kind of date/time values.

If you’re working with date/time values in Scala on a daily basis, please, give DTC a try and tell me what you think!

Thanks for reading! 🙂

Published slides from my “Recursive implicit proofs with Shapeless HLists” talk

Recently I gave an introductory talk about constructing implicit proofs with Shapeless at the Scala Riga Meetup.

Contents in brief

  • Analogy between Scala implicit resolution and Mathematical proofs.
  • Recursive implicit proofs (why “recursive” is important feature)
  • HList basics
  • Various implicit proofs for HLists with demos. Mathematical induction as a handy technique for writing HList proofs.
  • Expanding HList proof to all product-like types with shapeless.Generic
Slides (english, PDF)
Slides (english, Slideshare)

Implementing type-safe request builder: practical use of HList constraints

Hello, Scala developers!

In this post we will develop a simple type-safe request builder. Along the way, we will:

  • encode domain rules and constraints at compile time with implicits;
  • heavily use HLists;
  • use existing and implement a couple of new HList constraints.

The example in this post is a simplified version of a real request builder from my Scalist project. Implementation of intentionally omitted concepts, like response parsing and effect handling, can be found there.

The code from this article can be found on GitHub.

Prerequisites

I assume that you, Dear Reader, have a basic understanding of:

  • implicit resolution;
  • typeclasses;
  • what an HList is.

If some of those are missing, I encourage you to come back later: there are a lot of good articles and talks on these topics out there on the web.

The problem

We will be developing a query request builder for an API that is capable of joining together multiple logical requests and returning all results in one single response. A great example is the Todoist API: you are allowed to say "Give me projects, labels and reminders", and the API will return all of them in a single JSON response.

Such an optimization is great from a performance perspective. On the other hand, it imposes difficulties for a type-safe client implementation:

  • you need to somehow track the list of requested resources;
  • the return value type must reflect this list and not allow querying for anything else.

Those problems are not easy to solve at compile time, but, as you might have heard, almost everything is possible with Shapeless 🙂

The task

The task here is to implement a request builder with following features:

  • allows to request single resource like this:
    builder.get[Projects].execute
  • allows to request multiple resources like this:
    builder.get[Projects].and[Labels].and[Comments].execute
  • doesn’t allow requesting duplicates. This will not compile:
    builder.get[Projects].and[Projects]
  • doesn’t allow requesting types that are not resources. This will not compile:
    builder.get[Passwords]
  • the execute method will return something that contains only the requested resources and won't even let you try to get anything else out of it.

Simplifications

To focus on the main topic, I will cut off all other aspects of implementing a good HTTP API client, like sending the request, parsing, abstracting over effects and so on.

To keep things simple, let’s make  execute method just ask for an implicit MockResponse typeclass instance, that will supply requested values instead of doing everything mentioned above.

Let’s code!

Model

We’ll start with some simple things, that are required though. First let’s define a model to play with. Just some case classes from task management domain:

 
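For example (field sets are illustrative):

    case class Project(id: Long, name: String)
    case class Label(id: Long, name: String)
    case class Comment(id: Long, text: String)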

What we are actually going to request are lists of domain objects. But get[Projects] looks better than get[List[Project]], so let's define some type aliases. This is also a good place to put an instance of our builder:

 
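A sketch (the Builder class itself comes a bit later):

    type Projects = List[Project]
    type Labels   = List[Label]
    type Comments = List[Comment]

    // the instance all request chains will start from
    object builder extends Builder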

Good. We’ll get to the Builder later. Now let’s define

APIResource typeclass

An instance of APIResource[R] is a marker that R can be requested with our builder.

 
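Roughly:

    // an instance of APIResource[R] marks R as requestable
    sealed trait APIResource[R]

    object APIResource {
      implicit object projects extends APIResource[Projects]
      implicit object labels   extends APIResource[Labels]
      implicit object comments extends APIResource[Comments]
    }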

A couple of comments here:

  • In a real case the typeclass body would not be empty — it's a good place to define some entity-specific properties. A resource identifier that is used to create a request is a good example.
  • We mark APIResource as sealed here, because we know all the instances upfront and don’t want library users to create new ones.

Now we’re ready to implement the Builder.

Single resource request

The builder call chain starts with the get method — let's create it:

 
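Along these lines:

    class Builder {
      // callable only for types with an APIResource instance in scope
      def get[R: APIResource]: SingleRequestDefinition[R] =
        new SingleRequestDefinition[R]
    }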

Quite simple for now: we allow the method to be called only for types that have an implicit APIResource instance in scope. It returns a request definition that can be executed or extended further with the and method.

We will solve the execution task first. A RequestDefinition trait defines the execute method for all request definitions:

 
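A sketch:

    trait RequestDefinition[Out] {
      // Out is the full result type: a single resource, or an HList of them
      def execute(implicit response: MockResponse[Out]): Out = response.value
    }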

There’s the MockResponse thing I was talking about. It allows us to avoid implementing all the real machinery that is not relevant for the topic of this post.
Almost everything about this typeclass is simple and straightforward, but there’s an interesting thing that will show up later, so I have to put the implementation here to reference it.

 
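Roughly (the mock values are mine):

    trait MockResponse[A] {
      def value: A
    }

    object MockResponse {
      implicit val projects: MockResponse[Projects] =
        new MockResponse[Projects] { def value = List(Project(1, "Inbox")) }
      implicit val labels: MockResponse[Labels] =
        new MockResponse[Labels] { def value = List(Label(1, "urgent")) }
      implicit val comments: MockResponse[Comments] =
        new MockResponse[Comments] { def value = List(Comment(1, "LGTM")) }
    }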

Ok, we’re able to execute requests! Let’s see how we can solve the chaining task with and method.

Chaining: 2 resources request

First, we have to decide what type of value an executed multiple-resource request should return. A standard List or Map could do the job in general, but not with our requirements — precious type information would be lost.

This is where HList comes in handy: it's designed to store several instances of unrelated types without losing any information about those types.

Next. When implementing a type-safe API, we have to put all domain constraints into declarations, so that they're available to the compiler. Let's write out what is required to join two resources in one request:

  • both of them must have an APIResource instance in place, and
  • their types must be different

Good, we’re ready to make our first step to a multiple resource request definition:

 
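A sketch (MultipleRequestDefinition is defined in the next section):

    import shapeless._

    class SingleRequestDefinition[R: APIResource] extends RequestDefinition[R] {
      def and[RR](implicit
        AR: APIResource[RR], // the new resource type is valid
        NEQ: RR =:!= R       // and differs from the one we already have
      ): MultipleRequestDefinition[RR :: R :: HNil] =
        new MultipleRequestDefinition[RR :: R :: HNil]
    }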

All our constraints are in place:

  • the context bound ensures that original resource type is valid.
  • AR: APIResource[RR] implicit parameter ensures that newly added resource type is valid.
  • NEQ: RR =:!= R implicit ensures that R and RR are different types.

Also, you can notice the HList in the result type parameter. It ensures at compile time that the execution result is precisely what we requested.

Now we’re all set up to dive into the most interesting and complex problem.

Chaining: multiple resources request

Here the task is to append a new resource request R to a list L of already requested ones. Again, let’s first define required constraints in words:

  1. every element of  L must have an APIResource instance in place;
  2. R must have an APIResource instance too;
  3. L must not contain duplicates. Let’s call such a list “distinct”;
  4. L must not already contain R. In other words, the resulting list must be distinct too.

That’s a lot. Let’s see, what implementation we can come up with here.
We will start from the MultipleRequestDefinition class itself:

 
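Something like:

    import shapeless.ops.hlist.LiftAll

    class MultipleRequestDefinition[L <: HList](implicit
      allAR: LiftAll[APIResource, L], // requirement #1
      ID: IsDistinctConstraint[L]     // requirement #3
    ) extends RequestDefinition[L]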

Actually, we already have everything for our first requirement. The LiftAll typeclass from shapeless ensures that every element of an HList has a specified typeclass instance.
In our case, the implicit allAR: LiftAll[APIResource, L] constrains L to have an APIResource for each element.

The implicit ID: IsDistinctConstraint[L] ensures that all elements of L are different (requirement #3). There's no IsDistinctConstraint in shapeless 2.3.0, so we will have to implement it ourselves. We'll get to that later.

That’s it for the class definition. Let’s move on to the and combinator:

 
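A sketch — this method lives inside MultipleRequestDefinition[L], so the class-level implicits are in scope:

    def and[R](implicit
      AR: APIResource[R],             // requirement #2
      NC: NotContainsConstraint[L, R] // requirement #4
    ): MultipleRequestDefinition[R :: L] =
      new MultipleRequestDefinition[R :: L]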

Requirement #2 is trivial here. NotContainsConstraint for requirement #4 will have to be implemented by us too.

All right, so we have two HList constraints to implement. Let’s see how it’s done.

Implementing HList constraint

In general, a constraint is implemented as a typeclass that provides instances for, and only for, types that meet the constraint.

Most of the time it can be done with a technique similar to mathematical induction. It involves two steps:

  1. Define a base case: implicit constraint instance for HNil or an HList of known length like 1 or 2.
    Base case depends on the nature of the constraint and can involve additional constraints for HList element types.
  2. Define the inductive step: an implicit function that describes how new elements can be added to an arbitrary HList that already meets the constraint.

We will start with NotContainsConstraint. The typeclass definition is quite straightforward:

 
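Roughly:

    import shapeless._

    // an instance exists if and only if L has no element of type U
    trait NotContainsConstraint[L <: HList, U] extends Serializable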

U  is the type that L must not contain to meet the constraint.

Let’s define the base case. Here it’s simple:

HNil doesn’t contain anything.

In general we want constraints to stay the same under any circumstances, so it's customary to define the implicit rules right in the typeclass companion object:

 
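Something like:

    object NotContainsConstraint {
      // base case: HNil doesn't contain anything
      implicit def hnilNotContains[U]: NotContainsConstraint[HNil, U] =
        new NotContainsConstraint[HNil, U] {}
    }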

Seems logical: for any type U we state that HNil doesn't contain it. Moving on to the inductive step, it can be expressed in words this way:

Given an HList that doesn’t contain U, we can add any non-U element to it and get a new HList, that still doesn’t contain U.

Let’s encode it.

 
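Also in the companion, roughly:

    implicit def hlistNotContains[L <: HList, U, T](implicit
      ev: L NotContainsConstraint U, // the tail doesn't contain U
      ev2: U =:!= T                  // and the new head is not a U
    ): NotContainsConstraint[T :: L, U] =
      new NotContainsConstraint[T :: L, U] {}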

Here we require an HList L that doesn't contain U (guaranteed by the implicit ev: L NotContainsConstraint U) and a type T that is not equal to U (ev2: U =:!= T). Given this evidence, we can state that T :: L doesn't contain U. We do it by supplying a new typeclass instance: new NotContainsConstraint[T :: L, U] {}.

Some tests:

 
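For example:

    implicitly[NotContainsConstraint[HNil, Int]]                   // compiles
    implicitly[NotContainsConstraint[String :: Long :: HNil, Int]] // compiles
    // doesn't compile:
    // implicitly[NotContainsConstraint[String :: Int :: HNil, Int]]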

Nice, it works!
I hope it's transparent how implicit resolution can or cannot find a constraint instance for an HList: we start with the HNil base case and move towards the list head. If the implicit chain is not broken along the way by a duplicate element, we get a constraint instance for the whole list.

Now we’re going to implement IsDistinctConstraint in a similar manner. And our fresh NotContainsConstraint is going to help us here!

Base case is quite simple:

HNil is a distinct list.

 
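Roughly:

    trait IsDistinctConstraint[L <: HList] extends Serializable

    object IsDistinctConstraint {
      // base case: an empty HList is trivially distinct
      implicit def hnilIsDistinct: IsDistinctConstraint[HNil] =
        new IsDistinctConstraint[HNil] {}
    }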

Inductive step is quite simple too:

If an HList L is distinct and it doesn't contain type U, then U :: L is a distinct list too.

 
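Also in the IsDistinctConstraint companion, roughly:

    implicit def hlistIsDistinct[H, T <: HList](implicit
      d: IsDistinctConstraint[T],      // the tail is distinct
      nc: NotContainsConstraint[T, H]  // and doesn't contain the head type
    ): IsDistinctConstraint[H :: T] =
      new IsDistinctConstraint[H :: T] {}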

Tests show that everything works as expected:
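    implicitly[IsDistinctConstraint[Int :: String :: Boolean :: HNil]] // compiles
    // doesn't compile:
    // implicitly[IsDistinctConstraint[Int :: String :: Int :: HNil]]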

Wiring everything up together

Now, when we’ve done all the preparation work, it’s time to get our builder to work.
We’ll try it out in a REPL session. Single request case goes first:

 
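With the mock instances from above, roughly:

    scala> builder.get[Projects].execute
    res0: Projects = List(Project(1,Inbox))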

Everything is ok, we’re getting the mocks, defined in MockResponse. But surprise awaits us, if we try to get multiple resources:

 
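Roughly:

    scala> builder.get[Projects].and[Labels].execute
    <console>: error: could not find implicit value for parameter response:
      MockResponse[Labels :: Projects :: shapeless.HNil]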

There’s no implicit mock response for our HList! We will have to add some implicits into MockResponse companion to help it join our results:

 
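Roughly:

    // added to the MockResponse companion:
    implicit val hnilResponse: MockResponse[HNil] =
      new MockResponse[HNil] { def value = HNil }

    // join any mock response with an already joined HList of responses
    implicit def hconsResponse[H, T <: HList](implicit
      h: MockResponse[H],
      t: MockResponse[T]
    ): MockResponse[H :: T] =
      new MockResponse[H :: T] { def value = h.value :: t.value }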

After all those constraint tricks, this simple typeclass extension should be transparent to you. We basically supply a MockResponse instance for any combination of MockResponses.

Important note: although the problem we're solving here looks artificial, it is not — in the real world we'd have to propagate the requested types through all the network & parsing machinery that obtains the result. It is the only way to keep compile-time safety.
And, similarly to our example, some (probably implicit) tools will be required for joining several results into an HList.

Finally, we get everything working! Notice the result types, and how HList allows selecting only the requested types.

 
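Roughly:

    scala> val result = builder.get[Projects].and[Labels].execute
    result: Labels :: Projects :: shapeless.HNil = List(Label(1,urgent)) :: List(Project(1,Inbox)) :: HNil

    scala> result.select[Projects]
    res1: Projects = List(Project(1,Inbox))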

All safety requirements are also met — there’s no room for programmer errors:
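    // none of these compile:
    builder.get[Projects].and[Projects] // duplicate resource
    builder.get[String]                 // no APIResource[String] instance
    result.select[Comments]             // Comments weren't requested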

Shapeless 2.3.1

Shapeless 2.3.1 is coming out soon, and it will contain both constraints we implemented here.

Conclusion

Creating a library or an API is a great responsibility. Providing the end user with a type-safe interface that doesn't allow running and deploying malformed definitions should be a high priority.

HList constraints are a great tool in a Scala API developer’s toolbox. In this post we’ve seen them in action, applied to a practical example.

Thanks for reading! See you in future posts 🙂

ELK Talk @ Qiwi Conf #3

Post about my ELK stack talk at Qiwi Conf #3 on 22.12.2015.

Things covered:

  • ELK components (Elasticsearch, Logstash, Kibana) overview
  • How to deploy ELK
    • quick & simple deployment
    • fault tolerant & scalable deployment
  • Basic Kibana usage
  • ELK based alerts with Elastalert
  • Custom metrics with Kibana dashboards
  • Some demos (not enough video quality to figure out what’s going on though)
Video (russian):

Slides (English):

Custom data validation rules with Shapeless tags

Incoming data validation is a problem that every API developer faces at some point. In this small article I'll show how shapeless tags can be used to express custom validation rules for the Play JSON deserializer.

The full sbt project for this article can be found here.

Prerequisites

I assume the reader is familiar with basic Play JSON converters and combinators. Though it will help, it's not strictly necessary for getting the idea.

Some Play JSON basics can be learned here.

The problem

Let’s say our API accepts credit card payments. We define a simple data model for those cards (expiration date is omitted for brevity):

 
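A sketch:

    // expiration date omitted for brevity
    case class CreditCard(number: String, cvv: String)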

We’re going to receive this as a JSON field of incoming request and have to validate the credentials against some rules:

  • card number is 16 to 19 digits after removing all whitespace chars (that’s why we made it a String, not a Long)
  • cvv is 3 to 5 digits

How can we implement this? An experienced API designer would shout: "Those are not strings!", and introduce some types. A completely valid point — a model like this:

 
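    case class CardNumber(number: String)
    case class CVV(value: String)
    case class CreditCard(number: CardNumber, cvv: CVV)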

would do the job. The single drawback is that you'd have to implement custom serialization/deserialization for CVV and CardNumber to stay with simple JSON strings. By default, an instance of this type would serialize like:

 
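Something like this (with the field names from the sketch above), instead of plain strings:

    {"number": {"number": "1234567890123456"}, "cvv": {"value": "123"}}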

Anyway, this is still a good design. But what if we want them to be Strings? For any reason — say, we'd have much cleaner code that uses this CreditCard class.
Let's set this as a requirement and see what we can do.

Simple strings would require defining custom Reads for every type they appear in. If we had a card number in some other type T, we'd have to duplicate that rule in Reads[T]. We don't want that.

Here is where tags come nicely into play.

Tags

As a quick intro, a tag is a marker for an existing type that creates a new type with following properties:

  1. Values of the new type can be used as the values of original untagged type.
  2. Values of the original type can’t be treated as tagged ones. Such code won’t compile.

In this article I will use shapeless tags. A simple usage example:

 
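    import shapeless.tag
    import shapeless.tag.@@

    trait EncryptedTag
    type Encrypted = String @@ EncryptedTag

    val encrypted: Encrypted = tag[EncryptedTag]("secret")
    val plain: String = encrypted // (1) compiles: a tagged String is a String
    // val e: Encrypted = "secret"   (2) doesn't compile: a plain String is not tagged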

Tags implementation is quite concise, you can look through it in the shapeless repo.

Defining a rule for tagged string

Returning to the initial problem, here is how we can use tags to define custom validation rules for those credentials.

First, let’s tag our model fields:

 
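A sketch:

    import shapeless.tag.@@

    trait CardNumberTag
    trait CVVTag

    case class CreditCard(number: String @@ CardNumberTag,
                          cvv: String @@ CVVTag)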

Now we can define rules for tagged types:

 
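Roughly like this (Play 2.4/2.5-era API):

    import play.api.data.validation.ValidationError
    import play.api.libs.json._
    import shapeless.tag

    implicit val cardNumberReads: Reads[String @@ CardNumberTag] =
      Reads.StringReads
        .filter(ValidationError("error.cardNumber"))(
          _.replaceAll("\\s", "").matches("[0-9]{16,19}"))
        .map(tag[CardNumberTag](_)) // tag the validated value

    implicit val cvvReads: Reads[String @@ CVVTag] =
      Reads.StringReads
        .filter(ValidationError("error.cvv"))(_.matches("[0-9]{3,5}"))
        .map(tag[CVVTag](_))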

Notice that as we define Reads for a tagged type, we must return the same tagged type. So this is where we tag the values.

Doing so allows us to use the default Play macros to define a Format for CreditCard (and any other type we'd like to put those "custom" strings in):

 
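    // Writes is contravariant, so the default String Writes already covers
    // the tagged strings — only the Reads side needed custom rules
    implicit val creditCardFormat: Format[CreditCard] = Json.format[CreditCard]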

That’s it. Let’s test:

 
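For example:

    val ok = Json.parse("""{"number": "1234 5678 9012 3456", "cvv": "123"}""")
    ok.validate[CreditCard]  // JsSuccess(CreditCard(...))

    val bad = Json.parse("""{"number": "1234", "cvv": "123"}""")
    bad.validate[CreditCard] // JsError: error.cardNumber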

So it works! 🙂
We left our strings almost untouched while not losing Play Reads  granularity.

Thanks for reading!

UPDATE. Note on Play route url binders.

A nice catch from Doug Clinton in comments: this trick won’t work with Play route parameter.
I’ll quote Doug:

The problem is that the generated routes file uses classOf[T]  when creating its invoker. classOf  expects a class type, and won’t compile when the parameter type is @@[String, IdTag] , which does not have a runtime class.

Thank you for addition, Doug!

Acknowledgement

I want to say a big “thank you” to Denis Mikhaylov (aka @notxcain) for introducing this concept to me.

How to make an idiomatic Javascript library with Scala.js

UPDATE: Article was updated to Scala.js version 0.6.7, which vastly simplifies Promises related section.

Scala.js opens the big world of frontend development to Scala fans. Most of the time a Scala.js project ends up being an independent browser or Node.js application. But there are cases where you'd want to make a library for general frontend developers.

There are some interesting gotchas in writing a Scala.js library in such a way that it feels natural to an average JS developer. In this article we will develop a simple Scala.js library (code) for working with the Github API, focusing on how idiomatic its JS API is.

But first, I’m sure you want to ask

Why would I do that?

Reasonable one.
You should consider developing such a library if:

  1. A client application for your Scala API backend already exists, and it’s native Javascript.

    Sadly, you will hardly get a chance to rewrite it from scratch with Scala.js, but it at least makes sense to write a communication / interpretation library for those guys.
    It will simplify interaction between you and the frontenders in two ways:
    It will simplify interaction between you and frontenders in two ways:

    • You can hide some tricky client-side logic there, and expose much simpler API.
    • Your library can work on model classes, defined in backend project (see Cross-Building). You get typesafe isomorphic code almost for free and can forget about client-server protocol synchronization problems.
  2. You develop a public API for developers, like Facebook’s Parse.

    A perfect solution for a Javascript API SDK. See all the advantages of the previous case.

Recently, I’ve faced the first case. Moreover, our REST-like JSON API has two different browser based clients. So developing an isomorphic library was a logical choice.

Let’s start with our library.

Requirements

  1. As Scala developers we want to write all business logic in familiar functional style, being able to use all the handy Scala features.
  2. Library API must be natural for JS developers.

Setting up project

Such a project doesn’t differ from a regular Scala.js app. If you are new to Scala.js, you can read this tutorial first.

Folder structure:

resources/index-fastopt.html — a page that will just load our library and  resources/demo.js file, that will test the API.

API

The purpose of the library is to simplify Github API interaction. For simplicity, we'll implement only one feature — loading users and their repos by login.

So it's basically one public method and a pair of model classes that store the results (value objects). The model is where we'll start writing code.

Model

Let’s define model classes like this:

Everything is easy: User has some repos, a repo is either an origin or a fork. Good old Scala model. How do we export that to JS developers?

For a full reference of exporting features see Export Scala.js APIs to Javascript

Object creation API

Let’s look at, how we should expose such API. It seems an easy solution to expose the constructor:

But this won’t work. You don’t have Option constructor exported, so there’s no way to create homepage  parameter.

Moreover, there are additional limitations for case classes: you can't export two case class constructors that are in an inheritance relationship. This code won't even compile:

So what is the best choice? I found that it’s best to leave constructors alone and just expose JS-friendly factory methods, like this:
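A sketch:

    import scala.scalajs.js
    import scala.scalajs.js.annotation.JSExport

    @JSExport
    object Github {
      @JSExport
      def forkRepo(name: String, homepage: js.UndefOr[String]): ForkRepo =
        ForkRepo(name, homepage.toOption)
    }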

Here, with the help of js.UndefOr, we handle the optional parameter the JS way: you can pass a String, or not pass anything at all:
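    // demo.js
    var withHomepage = Github().forkRepo("dtc", "http://example.com");
    var withoutIt    = Github().forkRepo("dtc"); // homepage is just undefined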

Note on caching Scala objects

Making the client call Github() every time is not the best API option. If you don't need laziness, you can cache it on startup:
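    // demo.js, on startup:
    var github = Github(); // call the exported object once and cache it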

Reading model properties

Seamless types

If we now try to read fork’s name, we’ll get undefined . Fair enough, it’s not exported. Let’s export model properties.

There’re no problem with native types like String , Boolean and Int . They can be exported as is:

A case class field can be exported with the @(JSExport @field) annotation. An example for the forks property, which is not a member of the Repo trait:
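A sketch:

    import scala.annotation.meta.field
    import scala.scalajs.js.annotation.JSExport

    case class OriginRepo(
      @(JSExport @field) name: String,
      homepage: Option[String],
      @(JSExport @field) forks: Int
    ) extends Repo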

Option

But as you can already expect, there's a problem with
homepage: Option[String]. Well, we can export it, but this would be useless — to get the actual string value, a JS developer would have to call something on the Option, and nothing there is exported.

On the other side, we’d like to keep Option in place, so that our Scala code, that manipulates value classes, remains powerful and simple.  js.UndefOr[T] API is way less expressive.

A solution here is to export a special JS-friendly getter method:
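One way to sketch it:

    import scala.scalajs.js
    import scala.scalajs.js.JSConverters._

    sealed trait Repo {
      def name: String
      def homepage: Option[String]

      // JS-friendly view of the Option: the value or undefined
      @JSExport("homepage")
      def homepageJS: js.UndefOr[String] = homepage.orUndefined
    }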

Let’s try it out, it works:

We retained our beloved Option monad, and exported nice and clean JS API. Great!

List

User.repos is a List and has the same export problems. The solution is the same too: we'll just export it as a plain JS Array:
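    import scala.scalajs.js.JSConverters._

    case class User(login: String, repos: List[Repo]) {
      @JSExport("repos")
      def reposJS: js.Array[Repo] = repos.toJSArray
    }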

Now we can even map them 🙂 :
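    // demo.js
    var names = user.repos.map(function (r) { return r.name; });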

Sum types

There’s still one problem with  Repo trait. As we’re not exporting constructors, given a  Repo instance, JS developer can’t figure out, what kind of  Repo it is.

In Javascript there’s no pattern matching and using inheritance is not so popular, sometimes even questionable. So we have several options here.

  1. Depending on the context, provide methods like isFork: Boolean or  hasForks: Boolean at the base level. This is perfectly fine, but not general enough.
  2. Add  type: String (or whatever name feels suitable to you) property to all sum types.

I chose the second one, because it can be abstracted and used throughout the whole codebase. Here's how it can be done. Let's declare a mixin that exports a type property:
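A sketch:

    trait JSTyped {
      // `type` is a reserved word in Scala, hence the different Scala-side name
      @JSExport("type")
      def jsType: String
    }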

We have to use a different name for the Scala definition, because type is a reserved word.
That's it! We can now mix it in:

… and use it:
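    // demo.js
    if (repo.type === "fork") { console.log("It's a fork!"); }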

To make this a little safer, we can store type name constants that can be compared with the instance's type property. This can be done in a typesafe way:
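A sketch:

    // ties a string constant to a concrete Repo subtype
    class TypeName[T <: JSTyped](val name: String)

    object TypeName {
      val origin = new TypeName[OriginRepo]("origin")
      val fork   = new TypeName[ForkRepo]("fork")
    }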

Having this helper class we can define these constants in our Github global for example:
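    @JSExport
    object Github {
      @JSExport val ORIGIN: String = TypeName.origin.name
      @JSExport val FORK: String   = TypeName.fork.name
    }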

Now we can avoid strings in Javascript! An example:
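    // demo.js
    var forks = user.repos.filter(function (r) { return r.type === github.FORK; });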

That’s how we dealt with sum types.

What if I can’t change object, that I want to export?

This is the case if you want to (maybe partially) export your cross-built model classes or objects from another library. The solution is the same as for Option and List, with the only difference that you have to implement the JS-friendly replacement classes and conversions yourself.

An important rule here is to use the JS replacements only for export (Scala => JS) and instance creation (JS => Scala). All business logic must be implemented with pure Scala classes.

Let’s say you have a Commit class, that you can’t change:

Here's what you can do to export it:
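    // a JS-friendly replacement, used only at the export boundary
    @JSExport
    class CommitJS(val underlying: Commit) {
      @JSExport def sha: String     = underlying.sha
      @JSExport def message: String = underlying.message
    }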

Then, for example, a Branch class that you own would look like this:
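    case class Branch(name: String, commits: List[Commit]) {
      @JSExport("commits")
      def commitsJS: js.Array[CommitJS] = commits.map(new CommitJS(_)).toJSArray
    }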

Since in the JS environment commits are represented by CommitJS objects, a factory method for Branch would be:
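    @JSExport
    def branch(name: String, commits: js.Array[CommitJS]): Branch =
      Branch(name, commits.map(_.underlying).toList)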

Of course, this workaround is not a beautiful thing, but at least it's type-checked. That's why I think it's preferable to view your library not only as a value-class proxy, but as a facade that hides redundant details and simplifies the API. That way you might not even need to export the underlying model.

That's all for exporting the model. Let's move on to the more interesting part — loading content from the Github API.

AJAX

Implementation

For brevity, we will use the scalajs-dom Ajax extension as our "network" layer. Let's forget for a while about how we're going to export things, and just implement the API.

For simplicity, we'll put everything AJAX-related into an API object. It will have two public methods: one for loading a user and one for loading repos.

We will also implement a DTO layer to decouple the API from the model. For type-safe error handling we'll use the Xor type from the Cats library. The result type of a method call will be Future[String Xor DTO], where DTO is the type of the requested data and String represents an error.

I’ve mentioned everything for this listing to be more understandable, here it is:

The deserialization code is hidden — it's not interesting. The load method returns a string error if the response code is not 200; otherwise it converts the response data to JSON and then to DTOs.

Now we can convert our API results into model classes.
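A sketch (cats 0.x-era imports assumed):

    import cats.data.{Xor, XorT}
    import cats.std.future._ // Future instances for cats 0.x

    def loadUser(login: String): Future[String Xor User] =
      (for {
        user  <- XorT(API.loadUser(login))
        repos <- XorT(API.loadRepos(login))
      } yield toUser(user, repos)).value

    // DTO => model conversion, omitted here
    private def toUser(u: UserDTO, rs: List[RepoDTO]): User = ???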

Here we use a monad transformer to combine these "disjunctioned" futures, and then convert the DTOs into model classes.

Well, that is quite idiomatic functional Scala — lots of pleasure. Now let's think about how we will export the loadUser method to library users.

Share the Future

To follow the article's goals, we need to answer the question: what is the idiomatic way to handle an asynchronous call in Javascript? I can already hear experienced frontenders laughing, because there is no such thing. Callbacks, event emitters, promises, fibers, generators, async/await — all of them are somehow valid approaches. So what should we choose?

I think the closest thing to a Scala Future in Javascript is a Promise. Promises are very popular and already native in most modern browsers. So we'll stick with them.

First, we must let our Scala code know about those promises. Before Scala.js 0.6.7 we would have had to use the Promise typed facade from scalajs-dom. But with Scala.js 0.6.7 things became much easier: we can just use the "standard" Promises.

All we have to do now is convert a Future into a Promise. Again, since version 0.6.7 this is no longer a problem — there's a toJSPromise converter in JSConverters. We just need to help it with the left side of our Xor by converting it to a failed Future, which yields a rejected Promise:
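A sketch:

    import scala.concurrent.Future
    import scala.scalajs.concurrent.JSExecutionContext.Implicits.queue
    import scala.scalajs.js
    import scala.scalajs.js.JSConverters._

    def toPromise[A](f: Future[String Xor A]): js.Promise[A] =
      f.flatMap {
        case Xor.Right(a)    => Future.successful(a)
        case Xor.Left(error) => Future.failed(new Exception(error))
      }.toJSPromise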

So let’s share the promise with our JS friends! As usual, we put it to Github object, near the original method:

Here, in case of a failed future, we reject the promise with the exception message. That's all — we can test the whole API now:
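    // demo.js
    github.loadUser("octocat").then(
      function (user) { console.log(user.login); },
      function (err)  { console.log("Failed: " + err); }
    );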

Well, we did it! We can use Futures and everything else we're used to — and still export an idiomatic JS API.

For more API usage examples see full demo.js. To play more with the project, just fetch the repo, then build and run it.

Conclusion

Putting it all together, here are some general advice on writing a Javascript library with Scala.js:

  • Cache exported objects on startup.
  • Export seamless types “as is”.
  • Don’t export Options, Lists and other Scala standard. Put a JS-friendly getter nearby, that converts to  js.UndefOr and js.Array. BTW, same with  Map => js.Dictionary.
  • Don’t export constructors. Use a JS-friendly factory method. JS-friendly means it accepts js.* types and converts it to Scala standard types.
  • Mixin a string type property into sum types.
  • Export Future s as js.Promise s
  • Scala first. You are a Scala developer, so don’t limit yourself in any way: use all the power you like. You know now, that you’ll be able to export it.

Links

Solid Type System vs Runtime Checks & Unit Tests (Scala example)

A month ago I gave a talk for my colleagues at our private developers meetup called QIWI Conf. It's about how the Scala type system can help you release safer code.

The talk is mostly for Scala beginners, yet some of the patterns covered are very powerful and not offered by the majority of other languages.

Contents in brief

  • Introduction
    • What are fail-fast options today?
    • Why compile-time checking is the best choice?
  • Patterns that can lift up your code to be checked at compile time
    • Options
    • Either and scalaz.\/
    • Sealed ADT + pattern matching
    • Tagging
    • Phantom types
    • Path dependent types
Slides (english)
Video (russian):

FPConf Notes

Notable talks from the FPConf conference.

Macros in Scala

A good introduction to Scala macros from a JetBrains Scala plugin developer.

A macro is basically an AST transformation.

Simple macros are implemented as method invocations:
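For example, a classic def macro sketch (not from the talk):

    import scala.language.experimental.macros
    import scala.reflect.macros.blackbox

    // the call site debug(x + 1) is replaced by the generated tree
    def debug(expr: Any): Unit = macro debugImpl

    def debugImpl(c: blackbox.Context)(expr: c.Tree): c.Tree = {
      import c.universe._
      q"""println(${showCode(expr)} + " = " + $expr)"""
    }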

More complex things are achieved with implicit macros.

Macros can help with:

  • Typeclass generation (including generic implicit macros as a fallback)
  • DSLs
  • Typechecked strings (e.g. format-string)
  • Compiler plugins

IDE support is far from ideal, mostly because of the limitations of today's Scala macro implementation. The coming Scala.meta should help here.

The speaker had a custom IDEA build on his laptop with a super-awesome "expand macros" feature: just press a magic shortcut and examine the expanded macro code.
A brilliant thing by JetBrains — can't wait to use it.

Embedding a language in string interpolator

Great example of custom string interpolator use case.

Mikhail Limansky showed an example of creating MongoDB query language interpreter with a string interpolator.

Most of the times you don’t need such things, for example if you use a good ORM. But speaker’s case is valid. He works with two projects each using different and verbose ORM’s to access a single mongo database. So he made a decision to implement a single expressive (and importantly— well-known) language interpreter, that would generate ORM code of choice.

Even though it’s a plain string interpolator, it’s almost compeletely typesafe, thanks to extensive macros usage. Talk covers every step to implement such a thing.

Here’s the library.
https://github.com/limansky/mongoquery

Frontend with joy (DataScript database)

Slides

Introduction into DataScript database: https://github.com/tonsky/datascript

In a few words — it’s an immutable in-memory database for browser js applications.
Along with mentioned immutability it has several other advantages:

  • Powerful query language with pattern matching.
  • A well-defined and simple format for describing database changes. With the ability to query such diffs, you get event sourcing out of the box.
  • It builds indexes.

Scala performance for those who doubt

Slides (from jpoint)

My personal favourite of all the fpconf talks.

You can’t measure performance by hand on JVM, because of
  • Deadcode elimination
  • Constant folding
  • Loop unrolling
Tool to use: JMH
Micro-benchmark runner. Has sbt plugin (sbt-jmh)
To find root of some performance problems it’s useful to look at bytecode and even assembler code.
Tools here: JMH perfasm profiler, javap

Measurements

1) Pattern matching
A simple ADT match is equal in speed to a sequence of if-clauses.
A null check is much faster than Option matching. In general, because of type erasure, pattern matching on parametrized types is slower (but that's a fair price for such a feature).

2) Tail recursion
Basically as fast as loops.

3) Collections
The fold and map combinators have significant overhead for big arrays, especially with primitives as elements.
That's partly because HotSpot's optimization heuristics are shaped for Java.

The problem with primitives is boxing. Scala collections are generic, so specialization doesn't work for them.

There is an alternative collection library called Debox, which has specialized Buffer, Set and Map.

Conclusion

  • Scala is slow:
    • it's easy to write beautiful but laggy code
    • collections are super slow with primitives
    • scalac can generate strange code
  • Scala is fast:
    • with good knowledge of the internals, beautiful code can work as fast as Java code
    • with a few hacks you can make collections friends with primitives
    • the JVM can optimize strange scalac-generated code