Avoiding Duplication Issues in SBT

Any series on sbt and sbt-assembly needs a short section on avoiding the dreaded deduplication error.

This article reviews how to specify merge strategies in sbt-assembly, as described on the sbt-assembly GitHub page, and takes a closer look at PathList.

Merge Strategies

When building a fat JAR in sbt assembly, it is common to run into the following error:

[error] (*:assembly) deduplicate: different file contents found in the following:

This error precedes a list of files with duplication issues.

The build.sbt file offers a way to avoid this error via the merge strategy. Using the error output, it is possible to choose an appropriate strategy to deal with duplication issues in assembly:

assemblyMergeStrategy in assembly := {
  // A string literal matches only when the whole path equals it
  case "Logger.scala" => MergeStrategy.first
  case "levels.scala" => MergeStrategy.first
  case "Tidier.scala" => MergeStrategy.first
  case "logback.xml" => MergeStrategy.first
  case "LogFilter.class" => MergeStrategy.first
  // PathList(ps @ _*) binds every path segment; ps.last is the file name
  case PathList(ps @ _*) if ps.last startsWith "LogFilter" => MergeStrategy.first
  case PathList(ps @ _*) if ps.last startsWith "Logger" => MergeStrategy.first
  case PathList(ps @ _*) if ps.last startsWith "Tidier" => MergeStrategy.first
  case PathList(ps @ _*) if ps.last startsWith "FastDate" => MergeStrategy.first
  // Everything else falls back to the previously configured strategy
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

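In this instance, MergeStrategy.first keeps the first of the conflicting files, in classpath order. PathList(ps @ _*) binds every segment of the path to ps, and ps.last selects the final segment, which is the file name.

A file name may also be matched directly with a string literal. Such a case matches only when the entire path equals the literal, so it applies to files at the root of the JAR.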
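MergeStrategy.first is not the only option. As a rough sketch, the fragment below uses three other strategies that sbt-assembly provides (concat, discard and last). The file names are hypothetical and only illustrate the shape of each case; the right strategy depends on the files reported in your own error output.

```scala
// build.sbt fragment: a sketch only, with hypothetical file names
assemblyMergeStrategy in assembly := {
  // Concatenate all copies of a configuration file instead of picking one
  case "reference.conf" => MergeStrategy.concat
  // Leave a file out of the fat JAR entirely
  case "unwanted.txt" => MergeStrategy.discard
  // Keep the last conflicting copy rather than the first
  case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.last
  // Defer to the previously configured strategy for everything else
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```

Cases are tried from top to bottom, so more specific patterns should come before broader ones.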

PathList

The merge patterns in sbt-assembly make use of the PathList object. The full object is quite small:

object PathList {
  private val sysFileSep = System.getProperty("file.separator")
  def unapplySeq(path: String): Option[Seq[String]] = {
    val split = path.split(if (sysFileSep.equals( """\""")) """\\""" else sysFileSep)
    if (split.size == 0) None
    else Some(split.toList)
  }
}

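This code splits a path on the system file separator ("\" on Windows, "/" on Unix-like systems). Because String.split takes a regular expression, a backslash separator is escaped before it is used. The extractor returns an Option[Seq[String]]: the path segments wrapped in Some, or None if the split yields no segments.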
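To see the splitting behaviour in isolation, here is a minimal sketch that mirrors the logic of unapplySeq outside of sbt. The SplitDemo object and the splitPath helper are names made up for this example, and the separator is passed in as a parameter (unlike the real PathList) so that both cases can be tried on any platform.

```scala
object SplitDemo {
  // Mirrors the splitting logic of PathList.unapplySeq. Taking the separator
  // as a parameter is an assumption made for this demo only.
  def splitPath(path: String, sep: String): Option[Seq[String]] = {
    // String.split expects a regular expression, so a lone backslash is escaped
    val regex = if (sep == """\""") """\\""" else sep
    val split = path.split(regex)
    if (split.isEmpty) None else Some(split.toList)
  }

  def main(args: Array[String]): Unit = {
    // Unix-style path: Some(List(javax, servlet, Servlet.class))
    println(splitPath("javax/servlet/Servlet.class", "/"))
    // Windows-style path: Some(List(javax, servlet, Servlet.class))
    println(splitPath("""javax\servlet\Servlet.class""", """\"""))
    // A path made only of separators yields no segments: None
    println(splitPath("/", "/"))
  }
}
```

The None case is why the extractor returns an Option: a pattern such as PathList(ps @ _*) simply does not match such a path.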

Because PathList defines unapplySeq, it can be used in Scala sequence patterns. For instance, it is possible to match anything under the javax.servlet package using:

PathList("javax", "servlet", xs @ _*) 

xs @ _* binds all remaining path segments after javax and servlet to xs, so the pattern matches any file under the javax.servlet package.
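The pattern can be tried outside of a build with a small sketch like the one below. It assumes Scala 2.12 or 2.13 syntax and uses a local copy of the extractor so that it runs without sbt-assembly. PathListDemo, classify and mkPath are hypothetical names for this example, and classify returns a label rather than a real MergeStrategy.

```scala
import java.io.File

object PathListDemo {
  // Local copy of the extractor so the sketch runs without sbt-assembly
  object PathList {
    private val sysFileSep = File.separator
    def unapplySeq(path: String): Option[Seq[String]] = {
      val split = path.split(if (sysFileSep == """\""") """\\""" else sysFileSep)
      if (split.isEmpty) None else Some(split.toList)
    }
  }

  // Hypothetical helper: returns a label where a build would return a MergeStrategy
  def classify(path: String): String = path match {
    // Fixed leading segments, with the rest of the path bound to xs
    case PathList("javax", "servlet", xs @ _*) =>
      s"servlet file, ${xs.size} trailing segment(s)"
    // All segments bound to ps, with a guard on the file name
    case PathList(ps @ _*) if ps.last.startsWith("Logger") => "logger file"
    case _ => "default"
  }

  // Joins segments with the platform separator so the demo runs on any OS
  def mkPath(parts: String*): String = parts.mkString(File.separator)

  def main(args: Array[String]): Unit = {
    // prints: servlet file, 2 trailing segment(s)
    println(classify(mkPath("javax", "servlet", "http", "HttpServlet.class")))
    // prints: logger file
    println(classify(mkPath("com", "example", "Logger.class")))
    // prints: default
    println(classify(mkPath("META-INF", "MANIFEST.MF")))
  }
}
```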

Conclusion

This article reviewed the basics of merge strategies in sbt-assembly and took a closer look at PathList.
