Sbt Pack With Xerial

Adding jars to a classpath should not be a chore. Often, using retrieveManaged in an SBT build is not quite what we want: with more than a few dependencies, having each jar placed in its own subfolder is unwieldy. This article discusses a solution to this issue using Xerial’s sbt-pack plugin.

 

Xerial Pack

Xerial offers a plugin that packages all jars in a single folder and creates a launch script for the configurable main class. This allows every jar to be placed on the classpath without listing every folder, and it keeps all dependencies in a single directory.

Simply place the following in project/plugins.sbt:

addSbtPlugin("org.xerial.sbt" % "sbt-pack" % "0.8.2")  // for sbt-0.13.x or higher

Then specify the packaging options in build.sbt:

packAutoSettings

In this instance, the main class will be automatically found. More options are discussed at the Xerial Github page.
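If the main class should be chosen explicitly instead, the plugin also exposes a packMain setting. A minimal sketch with a hypothetical com.example.Main (verify the setting names against the sbt-pack version in use):

packSettings

packMain := Map("my-app" -> "com.example.Main")

This maps the name of the generated launch script to the main class it starts.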

Packaging the jars requires a single command:

sbt pack

Conclusion

Using the classpath with many dependencies does not need to be a chore. Simply import the Xerial plugin and run sbt pack.


Enriching Scala With Implicits

Imagine you are an ETL developer using Spring Cloud Data Flow. Little else available for distributed, streaming ETL is as powerful as this tool. Alteryx and Pentaho are at least a year away from pushing out anything as capable, and while Pentaho might work, there are just too many holes to fill.

However, you could do with a more compact language than Java when programming against Spring. A powerful solution is to combine the Spring ecosystem with Scala, using implicits to eliminate redundant code.

This article focuses on using the enrichment pattern in Scala through the IterableLike trait and the concept of the implicit.

 

[Image: clock — “Time is Money”]

Implicits

Implicits allow values that are in scope to be supplied by the compiler when arguments are not given explicitly. Only one implicit of a given type should be in scope at a time:

implicit val myImplicitString : String = "hello there"

def printHello(implicit str : String) : Unit = {
   println(str) //should print "hello there"
}

printHello //no argument given; the compiler supplies myImplicitString

This snippet creates an implicit String and utilizes it in the method printHello, whose parameter is marked implicit. Defining two implicit Strings in the same scope makes the lookup ambiguous, and the call fails to compile.
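A minimal sketch of that ambiguity, with a hypothetical second value added:

implicit val myImplicitString : String = "hello there"
implicit val anotherImplicitString : String = "oops"

def printHello(implicit str : String) : Unit = println(str)

//printHello  //does not compile: ambiguous implicit values of type String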

Implicits also make the enrichment pattern, described in the following sections, possible: an implicit conversion attaches new methods to an existing library type whenever a method that does not exist on that type is called.

Remap An Object

Our enrichment example contains a method that removes an item from an IterableLike object only when a specific condition holds:

import scala.collection.IterableLike
import scala.collection.generic.CanBuildFrom

class IterableScalaFuncs[A,Repr](xs : IterableLike[A,Repr]){

    /**
      * Remove objects matching a predicate on a single property value.
      * @param f              The predicate to match against
      * @param cbf            The CanBuildFrom used to obtain a builder (supplied implicitly)
      * @tparam That          The result collection type
      * @return               A new collection of type That with matching elements removed
    */
    def removeObjectMatching[That](f : A => Boolean)(implicit cbf : CanBuildFrom[Repr,A,That]):That={
        val builder = cbf(xs.repr)
        val it = xs.iterator

        while(it.hasNext){
            val o = it.next()
            if(!f(o)){
                builder += o
            }
        }
        builder.result()
    }

}

IterableScalaFuncs contains a removeObjectMatching method that takes the result type as a type parameter, the predicate to match in the first parameter list, and implicitly receives the existing CanBuildFrom from IterableLike in the second parameter list. It then creates a builder of our type and populates it with the objects that do not match the predicate before returning a new collection with the appropriate items removed.

Enrichment Pattern

The enrichment pattern in Scala embellishes libraries by attaching code to them, as opposed to creating a wrapper class that must be instantiated explicitly. This requires implicitly attaching method definitions to existing libraries, but it allows Scala to be used easily from the console and in classes.

The class in the previous section can be attached to all IterableLike collections implicitly:

/**
  * The enrichment for Iterables wrapping IterableLike with IterableScalaFuncs
  * @param xs       Our IterableLike object
  * @tparam A       The type of the Iterable
  * @tparam Repr    Traversable Repr
*/
implicit def enrichIterable[A, Repr](xs: IterableLike[A, Repr]) = new IterableScalaFuncs(xs)

The method enrichIterable attaches to the target collections and is used as follows:

import ScalaFuncImplicits._
val list : List[(Int,Int)] = List[(Int,Int)]((1,2),(2,3)).removeObjectMatching(_._1 == 1) //should produce List((2,3))
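In Scala 2.10 and later, the same enrichment can also be written as an implicit class, removing the separate implicit def. A minimal sketch using the same imports as above:

object ScalaFuncImplicits {
  implicit class IterableScalaFuncsOps[A, Repr](val xs: IterableLike[A, Repr]) extends AnyVal {
    //remove every element matching the predicate, returning a new collection of the same shape
    def removeObjectMatching[That](f: A => Boolean)(implicit cbf: CanBuildFrom[Repr, A, That]): That = {
      val builder = cbf(xs.repr)
      xs.iterator.foreach(o => if (!f(o)) builder += o)
      builder.result()
    }
  }
}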

Conclusion

This article reviewed the power of Scala Implicits to reduce redundant code without accounting for every type. The enrichment pattern can be used to easily integrate methods with existing libraries.

Code is available at Github

Avoiding Duplication Issues in SBT

It goes without saying that any series on sbt and sbt assembly needs to also have a small section on avoiding the dreaded deduplication issue.

This article reviews how to specify merging in sbt assembly as described on the sbt assembly Github page and examines the PathList for added depth.


Merge Strategies

When building a fat JAR in sbt assembly, it is common to run into the following error:

[error] (*:assembly) deduplicate: different file contents found in the following:

This error precedes a list of files with duplication issues.

The build.sbt file offers a way to avoid this error via the merge strategy. Using the error output, it is possible to choose an appropriate strategy to deal with duplication issues in assembly:

assemblyMergeStrategy in assembly := {
  case "Logger.scala" => MergeStrategy.first
  case "levels.scala" => MergeStrategy.first
  case "Tidier.scala" => MergeStrategy.first
  case "logback.xml" => MergeStrategy.first
  case "LogFilter.class" => MergeStrategy.first
  case PathList(ps @ _*) if ps.last startsWith "LogFilter" => MergeStrategy.first
  case PathList(ps @ _*) if ps.last startsWith "Logger" => MergeStrategy.first
  case PathList(ps @ _*) if ps.last startsWith "Tidier" => MergeStrategy.first
  case PathList(ps @ _*) if ps.last startsWith "FastDate" => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

In this instance, the first discovered file listed in the sbt error log is chosen. PathList captures the entire path, with last selecting the final segment of that path.

A file name may be matched directly.

PathList

The sbt assembly merge strategy makes use of PathList. The full object is quite small:

object PathList {
  private val sysFileSep = System.getProperty("file.separator")
  def unapplySeq(path: String): Option[Seq[String]] = {
    val split = path.split(if (sysFileSep.equals( """\""")) """\\""" else sysFileSep)
    if (split.size == 0) None
    else Some(split.toList)
  }
}

This code splits a path on the system file separator ("/" on Linux, "\" on Windows). The return type is an Option containing the path segments as a List of strings.

Because PathList defines unapplySeq, it can be used directly in pattern matches. For instance, it is possible to match anything under javax.servlet.* using:

PathList("javax", "servlet", xs @ _*) 

Here, xs @ _* binds whatever path segments follow javax/servlet.
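For example, such a pattern frequently appears as a case in the merge strategy. A sketch (adjust the chosen strategies to your own project):

assemblyMergeStrategy in assembly := {
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first //anything under javax/servlet
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard //drop duplicated metadata
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}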

Conclusion

This article reviewed some basics of the merge strategy in sbt assembly with a further explanation of PathList.

A Quick Note on SBT Packaging: Packaging Library Dependencies, Skipping Tests, and Publishing Fat Jars

Sbt is a powerful build tool that, with the tilde operator, can run recurring builds in a nearly automated way, simplifying a Maven-style build process.

There is also the matter of slow build times with assembly. Avoiding slow builds by skipping sbt assembly and automating the process with sbt ~package is simple. However, running the resulting code requires having its dependencies available.

Even then, compiling a fat JAR for use across an organization may be necessary in some circumstances, such as when using native libraries, and can help avoid costly build problems.

This article reviews packaging library dependencies with sbt, offers tips for skipping tests, and provides an overview of publishing fat JARs with assembly using the sbt commands publish and publish-local.


Packaging Dependencies

Getting library dependencies to package alongside your jar is simple. Add the following to build.sbt:

retrieveManaged := true

The jars will be placed under the project root in lib_managed/.

Skipping Tests

If you want to avoid testing as well, add:

test in assembly := {} //override the test task in assembly with an empty task

This tells assembly not to run any tests by replacing the test task with one that does nothing.

Publishing An Assembled Fat JAR

There may be a need to publish a final fat JAR locally or to a remote repository such as Nexus. This may create merge conflicts in downstream assemblies or with other dependencies, so care should be exercised.

The following code, taken directly from the sbt-assembly Github page, pushes an assembled jar to a repository:

artifact in (Compile, assembly) := {
  val art = (artifact in (Compile, assembly)).value
  art.copy(`classifier` = Some("assembly"))
}

addArtifact(artifact in (Compile, assembly), assembly)

The sbt publish and sbt publish-local commands should now push a fat JAR to the target repository.

Conclusion

Avoiding constant assembly runs and publishing fat JARs requires only simple changes to build.sbt. With these changes, it is possible to obtain dependencies for the classpath, skip tests during assembly, and publish fat JARs.

An Introduction to Using Spring With Scala: A Positive View with Tips

Many ask why mix Spring and Scala. Why not?

Scala is a resilient language, and the movement to the Scala foundation has only made it and its interaction with Java stronger. Scala reduces many lines of Java clutter to fewer, significantly more readable lines of elegant code. It is faster than Python, is acquiring comparable serialization capability, and already has a much stronger functional aspect.

Spring is the powerful go-to tool for any Java programmer, abstracting everything from web app security and email to wiring a backend. Together, Scala 2.12+ and Spring make a potent duo.

This article examines a few key traits for those using Scala 2.12+ with Spring 3+.

[Image: bean — “Mr. Bean”]

Some of the Benefits of Mixing

Don’t recreate the wheel in a language that already uses your favorite Java libraries.

Scala mixed with Spring:

  • Eliminates lines of Java code
  • Obtains more functional power than Javaslang
  • Makes use of most if not all of the functionality of Spring in Scala
  • Places powerful streamlined threading capacity in the hands of the programmer
  • Creates much broader serialization capacity than Java

This is a non-exhaustive list of benefits.

When Dependency Injection Matters

Dependency injection is useful in many situations. At my workplace, I have developed tools that reduce thousands of lines of code to a few configuration scripts using Scala and Spring.

Dependency injection is useful when writing large amounts of code hinders productivity. It may be less useful when speed is the primary concern.

Annotation Configs

Every annotation in Spring works with Scala. @Service, @Controller, and the remaining stereotypes, @Autowired and all of the popular annotations are useable.

Using them in Scala is the same as in Java.

@Service
class QAController{
   ....
}

Scala.Beans

Unfortunately, Scala does not generate JavaBean-style getters and setters. It is therefore necessary to use the specialized @BeanProperty annotation from the scala.beans package. This annotation cannot be attached to a private variable.

@BeanProperty
val conf: String = null

If generating a boolean getter and setter, @BooleanBeanProperty should be used.

@BooleanBeanProperty
val isWorking : Boolean = false

Scala’s Beans package contains other useful tools that give some power over configuration.

Autowiring

Autowiring does not require jumping through hoops. The underlying principle is the same as when using Java. It is only necessary to combine @BeanProperty with @Autowired.

@BeanProperty
@Autowired(required = false)
val tableConf : TableConfigurator = null

Here, the autowired tableConf property is not required.
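Constructor injection also works and avoids @BeanProperty entirely, since Spring passes the dependency directly to the constructor. A minimal sketch using a hypothetical ReportService:

@Service
class ReportService @Autowired() (val tableConf : TableConfigurator) {
   //tableConf is supplied when Spring constructs the bean
}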

Configuration Classes

The XML context in Spring is slated for deprecation. To make code that will last, it is necessary to use a configuration class. Scala works seamlessly with the @Configuration component.

@Configuration
class AppConfig{
  @Bean
  def getAnomalyConfigurator():AnomalyConfigurator= {
    new AnomalyConfigurator {
      override val maxConcurrent: Int = 3
      override val errorAcceptanceCriteria: Double = 5.0
      override val columLevelStatsTable: String = "test"
      override val maxHardDifference: Int = 100
      override val schemaLevelStatsTable: String = "test"
      override val runId: Int = 1
    }
  }


  @Bean
  def getQAController():QAController={
    new QAController
  }
}

As with Java, the configuration class generates the beans. There is no difference in usage between the languages.

Create Classes without Defining Them

One of the more interesting features of Java is the ability to define an anonymous class as needed. Scala can do this as well.

new AnomalyConfigurator {
      ... //properties to use
}

This feature is useful when creating configuration classes whose traits take advantage of Spring.

Creating a Context

Creating contexts is the same in Scala as in Java. The difference is that classOf[] must be used in place of the .class property to obtain class names.

val context: AnnotationConfigApplicationContext = new AnnotationConfigApplicationContext()
context.register(classOf[AppConfig])
context.refresh()
val controller: QAController = context.getBean(classOf[QAController])

Conclusion

Scala 2.12+ works seamlessly with Spring. Requisite annotations and tools are all available in the language which compiles to comparable Java byte code.

Code for this article is available on Github

Faster Compilation of Scala Jars

Compiling Scala code is a pain. For testing and even deployment, there is a much faster way to deploy Scala code. Consider a recent project I am working on: the addition of three large fat jars and OpenCV caused a compile time of roughly ten minutes. Obviously, that is not production quality. As much fun as it is to check how the president is causing stocks to roller coaster when he speaks, or to yell expletives around co-workers, it is probably not desirable.

This article reviews continuous packaging with sbt. The sbt assembly plugin is not covered here; it is discussed in other articles in the series.


Repeated Compiling and Packaging

Sbt offers an extremely easy way to continuously package sources. Use it.

sbt ~package

The tilde works with any command and forces sbt to wait for a source change. For instance, ~compile behaves the same way but runs the compile command instead of package.

This spins up a check for source file changes and packages sources whenever changes are discovered. Pressing enter will exit the process.

Running the Packaged Sources

We were using sbt assembly for standalone jars. For testing, this quickly becomes a bad idea. Instead, place a large dependency jar or multiple smaller dependency jars in a folder and add them to the classpath. The following commands took compilation from ten minutes to under one. Packaging takes mere seconds, although a full update will take longer.

$ sbt clean
$ sbt package
$ java -cp ".:/path/to/my/jars/*" package.path.to.main.Main [arguments]

This set of commands cleans, packages, and then runs the Main class with your arguments.

This works miracles. Note that the classpath separator is system dependent: a colon separates entries on Linux and a semicolon separates entries on Windows.
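On Windows the same run command would look something like the following (the jar directory is hypothetical):

java -cp ".;C:\path\to\my\jars\*" package.path.to.main.Main [arguments]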

It is possible to avoid building fat jars using this method. This greatly reduces packaging times which can grow to minutes. In combination with a continuous build process, it takes a fraction of the time to run Scala code as when using the assembly command.

Make Sure You Use the Right Jars

When moving to continual packaging and running from the classpath, the merge strategy from sbt assembly no longer applies. To avoid conflicts, try creating a single all-in-one dependency jar and placing only that on your classpath alongside the jar containing your main class.
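One way to build such a dependency-only jar, assuming sbt-assembly is already installed in the project, is its assemblyPackageDependency task. A sketch of the workflow (the jar folder below is hypothetical):

$ sbt assemblyPackageDependency
$ sbt package
$ java -cp ".:/path/to/deps-and-app-jars/*" package.path.to.main.Main [arguments]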

Conclusion

This article reviewed a way to more efficiently package and run Scala jars without waiting for assembly. The method described here works well in test or in a well defined environment.

JavaCV Basics: Basic Image Processing

Here, we analyze some of the basic image processing tools in OpenCV and their use in GoatImage.

All code is available on GitHub under GoatImage. To fully understand this article, read the related articles and look at this code.

Select functions are exemplified here. In GoatImage, JavaDocs can be generated further explaining the functions. The functions explained are:

  • Sharpen
  • Contrast
  • Blur

Dilate, rotate, erode, min thresholding, and max thresholding are left to the code. Thresholding in OpenCV is described in depth with graphs and charts via the documentation.


Basic Processing in Computer Vision

Basic processing is the key to successful recognition. Training sets come in a specific form, and pre-processing is usually required to ensure the accuracy and quality of a program. JavaCV and OpenCV are fast enough to improve algorithmic performance in a variety of circumstances at a relatively small cost in speed. Each transform applied to an image takes time and memory but pays off handsomely when done correctly.

Kernel Processing

Most of these functions are linear transformations. A linear transformation uses a function to map one matrix to another (Ax = b). In image processing, this is done with a kernel: a small weighted matrix that maps each pixel to a weighted combination of itself and its neighbors.

For an overview of image processing kernels, see wikipedia.

Kernels may be generated in JavaCV.


  /**
    * Create a kernel from a two dimensional array (write large kernels more understandably)
    * @param kernelArray      The two dimensional array holding the kernel values as signed ints
    * @return                 The kernel mat
    */
  def generateKernel(kernelArray: Array[Array[Int]]):Mat={
    val m = if(kernelArray != null) kernelArray.length else 0
    if(m == 0 ){
      throw new IllegalStateException("Your Kernel Array Must be Initialized with values")
    }

    if(kernelArray(0).length != m){
      throw new IllegalStateException("Your Kernel Array Must be Square and not sparse.")
    }

    val kernel = new Mat(m,m,CV_32F,new Scalar(0))
    val ki = kernel.createIndexer().asInstanceOf[FloatIndexer]

    for(i <- 0 until m){
      for(j <- 0 until m){
        ki.put(i,j,kernelArray(i)(j))
      }
    }
    kernel
  }

More reliably, there is a function for generating a Gaussian Kernel.

/**
    * Generate the square gaussian kernel. I think the pattern is a(0,0)=1 a(1,0) = n a(2,0) = n+2i with rows as a(2,1) = a(2,0) * n and adding two to the middle then subtracting.
    * However, there were only two examples on the page I found so do not use that without verification.
    *
    * @param kernelMN    The m and n for our kernel matrix
    * @param sigma       The sigma to multiply by (kernel standard deviation)
    * @return            The resulting kernel matrix
    */
  def generateGaussianKernel(kernelMN : Int, sigma : Double):Mat={
    getGaussianKernel(kernelMN,sigma)
  }

Sharpen with a Custom Kernel

Applying a kernel in OpenCV can be done with the filter2D method.

filter2D(srcMat,outMat,srcMat.depth(),kernel)

Here, a sharpening kernel built with the function above is applied.

/**
    * Sharpen an image with a standard sharpening kernel.
    * @param image    The image to sharpen
    * @return         A new and sharper image
    */
  def sharpen(image : Image):Image={
    val srcMat = new Mat(image.image)
    val outMat = new Mat(srcMat.rows(),srcMat.cols(),srcMat.`type`())

    val karr : Array[Array[Int]] = Array[Array[Int]](Array(0,-1,0),Array(-1,5,-1),Array(0,-1,0))
    val kernel : Mat = this.generateKernel(karr)
    filter2D(srcMat,outMat,srcMat.depth(),kernel)
    new Image(new IplImage(outMat),image.name,image.itype)
  }

Contrast

Contrast increases the intensity differences in an image via a direct equation, histogram equalization, or operations based on neighboring pixels.

One form of Contrast applies a direct function to an image:

/**
    * Use an equation applied to the pixels to increase contrast. It appears that
    * the majority of the effect occurs from converting back and forth with a very
    * minor impact for the values. However, the impact is softer than with equalizing
    * histograms. Try sharpen as well. The kernel kicks up contrast around edges.
    *
    * (maxIntensity/phi)*(x/(maxIntensity/theta))**0.5
    *
    * @param image                The image to use
    * @param maxIntensity         The maximum intensity (numerator)
    * @param phi                  Phi value to use
    * @param theta                Theta value to use
    * @return
    */
  def contrast(image : Image, maxIntensity : Double, phi : Double = 0.5, theta : Double = 0.5):Image={
    val srcMat = new Mat(image.image)
    val outMat = new Mat(srcMat.rows(),srcMat.cols(),srcMat.`type`())

    val usrcMat = new Mat()
    val dest = new Mat(srcMat.rows(),srcMat.cols(),usrcMat.`type`())
    srcMat.convertTo(usrcMat,CV_32F,1/255.0,0)

    pow(usrcMat,0.5,dest)
    multiply(dest,(maxIntensity / phi))
    val fm = 1 / Math.pow(maxIntensity / theta,0.5)
    multiply(dest, fm)
    dest.convertTo(outMat,CV_8U,255.0,0)

    new Image(new IplImage(outMat),image.name,image.itype)
  }

Here the image is manipulated using matrix equations to form a new image where pixel intensities are improved for clarity.

Another form of contrast equalizes the image histogram:

  /**
    * A form of contrast based around equalizing image histograms.
    *
    * @param image    The image to equalize
    * @return         A new Image
    */
  def equalizeHistogram(image : Image):Image={
    val srcMat = new Mat(image.image)
    val outMat = new Mat(srcMat.rows(),srcMat.cols(),srcMat.`type`())
    equalizeHist(srcMat,outMat)
    new Image(new IplImage(outMat),image.name,image.itype)
  }

The JavaCV method equalizeHist is used here.

Blur

Blurring uses averaging to dull images.

Gaussian blurring uses a Gaussian derived kernel to blur. This kernel uses an averaging function as opposed to equal weighting of neighboring pixels.

 /**
    * Perform a Gaussian blur. The larger the kernel, the more blurred the image will be.
    *
    * @param image              The image to use
    * @param kernelMN           The kernel height and width should match (for instance 5x5)
    * @param sigma              The sigma to use in generating the matrix
    * @param brightenFactor     A factor to brighten the result by when greater than 0
    * @return                   A new Image
    */
  def gaussianBlur(image : Image, kernelMN : Int = 3, sigma : Double = 60, brightenFactor : Int = 0):Image={
    val srcMat = new Mat(image.image)
    val outMat = new Mat(srcMat.rows(),srcMat.cols(),srcMat.`type`())

    //apply a Gaussian kernel of the requested size and sigma
    GaussianBlur(srcMat,outMat,new Size(kernelMN,kernelMN),sigma)

    var outImage : Image = new Image(new IplImage(outMat),image.name,image.itype)

    if(brightenFactor > 0){
      outImage = this.brighten(outImage,brightenFactor)
    }
    outImage
  }

A box blur uses a straight kernel to blur, often weighting pixels equally.

 /**
    * Perform a box blur and return a new Image. Increasing the factor has a significant impact.
    * This algorithm tends to be overly powerful. It wiped the lines out of my test image.
    *
    * @param image   The Image object
    * @param depth   The depth to use with -1 as default corresponding to image.depth
    * @return        A new Image
    */
  def boxBlur(image : Image,factor: Int = 1,depth : Int = -1):Image={
    val srcMat = new Mat(image.image)
    val outMat = new Mat(srcMat.rows(),srcMat.cols(),srcMat.`type`())

    //build kernel
    val kernel : Mat = this.generateKernel(Array(Array(factor,factor,factor),Array(factor,factor,factor),Array(factor,factor,factor)))
    divide(kernel,9.0)

    //apply kernel
    filter2D(srcMat,outMat, depth, kernel)

    new Image(new IplImage(outMat),image.name,image.itype)
  }

Unsharp Masking

Once a blurred Mat is achieved, it is possible to perform an unsharp mask. The unsharp mask brings out certain features by subtracting the blurred image from the original while taking into account an additional factor.

def unsharpMask(image : Image, kernelMN : Int = 3, sigma : Double = 60,alpha : Double = 1.5, beta : Double= -0.5,gamma : Double = 2.0,brightenFactor : Int = 0):Image={
    val srcMat : Mat = new Mat(image.image)
    val outMat = new Mat(srcMat.rows(),srcMat.cols(),srcMat.`type`())
    val retMat = new Mat(srcMat.rows(),srcMat.cols(),srcMat.`type`())

    //using these methods allows the matrix kernel size to grow
    GaussianBlur(srcMat,outMat,new Size(kernelMN,kernelMN),sigma)
    addWeighted(srcMat,alpha,outMat,beta,gamma,retMat)

    var outImage : Image = new Image(new IplImage(outMat),image.name,image.itype)

    if(brightenFactor > 0){
      outImage = this.brighten(outImage,brightenFactor)
    }

    outImage
  }

Conclusion

This article examined basic image processing techniques in JavaCV, including sharpening, contrast, blurring, and unsharp masking, as implemented in GoatImage.