Morning Joe: Tor May Be Broken but It's Still Useful

Tor, contrary to what someone with higher powers wants you to believe, is not dead. In fact, a presentation you never heard, about something that isn’t supposed to happen, was pulled from devcon in Las Vegas just last week. However, Tor, while not anonymous without improvement, is still a useful proxy server.

Consider how, every once in a while, I want to pull data from a site quickly. I do not want to spend money on proxies since a client is only paying $50 per sheet of 10,000 to 20,000 records. The answer is Tor. Four to five proxies later, with a decent timeout, I am acquiring data without paying $20 or more for proxies.
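As a rough sketch of what this looks like in Java, the snippet below prepares a connection routed through a local Tor client. It assumes Tor is running on the same machine and listening on its default SOCKS port, 9050; the class name, method names, target URL, and timeout are all made up for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

class TorProxyExample {
    // Assumption: a local Tor client listening on its default SOCKS port
    static final String TOR_HOST = "127.0.0.1";
    static final int TOR_SOCKS_PORT = 9050;

    /** Builds a SOCKS proxy descriptor pointing at the local Tor client. */
    static Proxy torProxy() {
        return new Proxy(Proxy.Type.SOCKS,
                new InetSocketAddress(TOR_HOST, TOR_SOCKS_PORT));
    }

    /** Prepares (but does not yet open) a connection routed through Tor. */
    static URLConnection prepare(String address, int timeoutMillis) throws IOException {
        URLConnection conn = new URL(address).openConnection(torProxy());
        conn.setConnectTimeout(timeoutMillis); // Tor circuits can be slow to build
        conn.setReadTimeout(timeoutMillis);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URLConnection conn = prepare("http://example.com/", 30_000);
        System.out.println("Prepared " + conn.getURL() + " via " + torProxy());
        // Calling conn.getInputStream() here would send the request through Tor
    }
}
```

Nothing touches the network until the stream is opened, so the timeout can be tuned per site before the request goes out.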

Tor may seem dead, but long live Tor!

Java Edge Detection and Imaging SDK: Get Involved, Create a Better Java Tool

Searching for a way to parse certain files in Java with the re-usability of the Spring framework, I discovered that there are not many imaging SDK tools available with the power of Python's PIL or Pillow. I decided to go back to the drawing board. The issue was made more important by the need to run the tools on a server without much of a graphics processor. My servers tend to have native graphics processing capability, 4 GB of RAM, and a dual-core 2.0 GHz processor.

In response, I began writing an imaging toolkit. Documentation and the classes are available on GitHub to be forked and contributed to.

Current features (all homebrew implementations of common theory) include but are not limited to:

  • Gaussian Blur
  • Box Blur
  • Sharpen
  • Denoise
  • Color Inversion
  • Decluttering
  • Canny Edge Detection

Memory use is improved with the use of the proxy pattern and speed improvements include the use of greedy algorithms.
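To give a feel for the kind of operation in the list above, here is a minimal box blur sketch. It is not the code from my SDK: it assumes a grayscale image held as a plain int array with values from 0 to 255, uses a fixed 3x3 window, and the class and method names are hypothetical.

```java
import java.util.Arrays;

class BoxBlurSketch {
    /**
     * Applies a 3x3 box blur to a grayscale image stored as rows of 0-255 values.
     * Border pixels average only the neighbours that fall inside the image.
     */
    static int[][] boxBlur(int[][] src) {
        int h = src.length;
        int w = src[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sum = 0;
                int count = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int ny = y + dy;
                        int nx = x + dx;
                        if (ny >= 0 && ny < h && nx >= 0 && nx < w) {
                            sum += src[ny][nx];
                            count++;
                        }
                    }
                }
                out[y][x] = sum / count; // integer mean of the window
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] img = { {0, 0, 0}, {0, 90, 0}, {0, 0, 0} };
        // prints [[22, 15, 22], [15, 10, 15], [22, 15, 22]]
        System.out.println(Arrays.deepToString(boxBlur(img)));
    }
}
```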

The Benefits of Combining Fork Join Pools with PostgreSQL

Warning: As I am incredibly busy at the moment, formal benchmarks are not provided. This is a performance review based on work experience and the accompanying documents for my superiors.

PostgreSQL is a terrific free tool. In my opinion, it can do almost all that an Oracle product can and has quick functions for performing critical tasks such as dumping an entire database to a CSV from a programming language. However, insertion can be slow. I am constantly updating three or more databases with 15 or more attributes apiece. Unfortunately, my company's estimated revenue is about the cost of an Oracle or Microsoft license. While PostgreSQL developers promise multi-threading in the future, Java, along with Apache, has actually solved a significant portion of this task for us using Fork Join Pools. The result can be an improvement of thousands of records per minute.

Fork Join Pools and Why to Use SE 8

Java SE 7 introduced the Fork Join Pool to the java.util.concurrent package. Oracle recommends only using it for extremely intensive tasks. The tool works by work-stealing: idle threads take queued tasks from busy threads. Java SE 8 improves on the algorithm and reduces the number of tasks which are dropped.
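For readers new to the framework, the following is a small sketch of the fork and join idea: a task splits itself in half until the pieces are small, then the partial results are combined. The array sum, the threshold of 1,000 elements, and all names are hypothetical and chosen only to keep the example short.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSumSketch {
    /** Sums a slice of an array, splitting the slice until it is small enough. */
    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000; // hypothetical cut-off
        private final long[] data;
        private final int from;
        private final int to;

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                     // queue the left half for any idle worker
            long rightSum = right.compute(); // work on the right half in this thread
            return left.join() + rightSum;   // wait for the left half and combine
        }
    }

    static long sum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // prints 5000050000
    }
}
```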

Setting Up the Pool In Spring 4

I reviewed several different connection pools before settling on the newest, vastly improved version of Apache DBCP. BoneCP offers better performance but at the cost of ignoring critical flaws. Chiefly, the current version fails to close connections properly, which led its developer to recommend that I revert to version 0.8. The new Apache DBCP outperformed C3P0 in my benchmark tests. My outlook is provided below. It is based on connections from a machine over the internet to a co-location facility and from a machine to a local machine.

Connection Pool | Pros     | Cons
Apache DBCP     | Reliable | Somewhat slower than BoneCP
BoneCP          | Fast     | Somewhat unreliable
C3P0            | Reliable | Slow and a worse option than DBCP

Setting up DBCP is simple, and the pool is highly configurable. Spring can be configured with Java annotations or XML. For my purposes, XML was a decent option since my teammates use XML in their daily work, so the XML configuration is provided below. A decent way to set this up is to declare the data source as its own bean and reference it in the declaration of the DAO template's data source.

<!-- the class name assumes Apache Commons DBCP 2 -->
<bean id="dataSource" class="org.apache.commons.dbcp2.BasicDataSource" destroy-method="close">
    <property name="driverClassName" value="${jdbcdriver}"/>
    <property name="url" value="${jdbcurl}"/>
    <property name="username" value="${dbcusername}"/>
    <property name="password" value="${dbcpass}"/>
    <property name="initialSize" value="3"/>
    <property name="validationQuery" value="SELECT 1"/>
</bean>

Other options are documented on the DBCP site. A validation query is set here since I ran into an issue with the validation query in my own work.

A reference can be provided to the DAO template using:

     <bean id="jdbcTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
         <property name="dataSource" ref="dataSource"/>
     </bean>

Delivering the Code with a DAO Template: Avoiding Race Conditions

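An important consideration is how multiple connections will be handled by the DAO template. Fortunately, Java offers asynchronous execution, and Spring allows a method to run asynchronously simply by adding the annotation @Async. Note that this is not the same as declaring a method synchronized: the call is handed to a task executor rather than guarded by a lock. Asynchronous support must also be enabled in the Spring configuration, for example with @EnableAsync or the task:annotation-driven element.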

   @Async
   protected void postData(Map data){
       /**call to DAO Template**/
   }


The method above is now invoked asynchronously.

Spring accepts the interface at the top of the collection hierarchy that the more common collections implement or extend. Basically, due to inheritance, a HashMap “is-a” Map, but a Map is not necessarily a HashMap, so a Map can only be cast to HashMap when the underlying object really is one. Call the getClass() method to see this in action. If getClass() returns java.util.HashMap, it is possible to cast the Map to HashMap to gain any benefits beyond the Map interface. Java hands the method a reference to the caller's object rather than a copy, so the runtime class of the object is preserved.
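The check described above can be sketched as follows. The class and method names are hypothetical; the point is only that the cast is guarded by the runtime type of the object behind the Map reference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class MapCastSketch {
    /** Describes the object behind a Map reference, casting only when it is safe. */
    static String describe(Map<String, Object> data) {
        if (data instanceof HashMap) {
            // Safe: the runtime class really is HashMap (or a subclass of it)
            HashMap<String, Object> hashMap = (HashMap<String, Object>) data;
            return "HashMap with " + hashMap.size() + " entries";
        }
        // Casting here would throw ClassCastException, so report the class instead
        return data.getClass().getName();
    }

    public static void main(String[] args) {
        Map<String, Object> a = new HashMap<>();
        a.put("id", 1);
        Map<String, Object> b = new TreeMap<>();
        System.out.println(describe(a)); // prints HashMap with 1 entries
        System.out.println(describe(b)); // prints java.util.TreeMap
    }
}
```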

The Fork Join Pool

In this instance, the Fork Join Pool should accept RecursiveAction objects. They are not only useful in recursive tasks. Thanks to work-stealing, they improve I/O and other intensive tasks as well. Oddly, I ran into an issue where methods returning values with RecursiveTask (my parsers) would not completely close, so I switched to using Callables instead. Fork Join Pools explicitly accept Runnable, Callable, and ForkJoinTask objects. Callable is not a Runnable; the pool simply provides an overload for each. Thread implements Runnable, and RecursiveAction and RecursiveTask extend ForkJoinTask, so all of these are acceptable.
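A minimal sketch of the Callable route is shown below. The FieldCounter class is a hypothetical stand-in for one of my parsers; it only counts comma-separated fields so that the example stays self-contained.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

class CallableOnForkJoin {
    /** Hypothetical parser stand-in: counts the comma-separated fields in a line. */
    static class FieldCounter implements Callable<Integer> {
        private final String line;

        FieldCounter(String line) {
            this.line = line;
        }

        @Override
        public Integer call() {
            return line.split(",").length;
        }
    }

    /** Submits the Callable to the pool and waits for its result. */
    static int countFields(ForkJoinPool pool, String line) {
        ForkJoinTask<Integer> pending = pool.submit(new FieldCounter(line));
        return pending.join(); // blocks until call() has returned
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            System.out.println(countFields(pool, "a,b,c")); // prints 3
        } finally {
            pool.shutdown();
        }
    }
}
```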

Instantiation for this task requires:

ForkJoinPool fjp=new ForkJoinPool(Runtime.getRuntime().availableProcessors()*procnum);

The argument is the parallelism level, which is the target number of worker threads. A value that is zero, negative, or above the implementation limit throws an IllegalArgumentException.
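The sizing rule can be tried out with a sketch like the one below. The multiplier plays the role of procnum; the class and method names are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;

class PoolSizeSketch {
    /** Creates a pool sized as (available cores * multiplier). */
    static ForkJoinPool create(int multiplier) {
        int cores = Runtime.getRuntime().availableProcessors();
        return new ForkJoinPool(cores * multiplier);
    }

    /** Returns true when the constructor rejects the requested parallelism. */
    static boolean rejects(int parallelism) {
        try {
            new ForkJoinPool(parallelism).shutdown();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = create(2);
        System.out.println("Parallelism: " + pool.getParallelism());
        System.out.println("Zero rejected: " + rejects(0)); // prints Zero rejected: true
        pool.shutdown();
    }
}
```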

Keep in mind that Amdahl's law applies: the serial portion of the work limits the achievable speedup, and too many threads will cause the process to slow down, as does inappropriate use.
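To put numbers on the diminishing returns, the law can be computed directly as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. The 90% parallel fraction below is a made-up figure, not a measurement from my application.

```java
class AmdahlSketch {
    /** Speedup predicted by Amdahl's law: 1 / ((1 - p) + p / n). */
    static double speedup(double parallelFraction, int workers) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / workers);
    }

    public static void main(String[] args) {
        double p = 0.9; // hypothetical parallel fraction
        for (int n : new int[] {1, 2, 4, 8, 16, 64}) {
            System.out.printf("%2d workers -> %.2fx%n", n, speedup(p, n));
        }
        // With p = 0.9 the speedup can never reach 10x, however many workers are added
    }
}
```

The law only covers the ceiling on speedup. The slowdown from oversubscribing threads comes on top of it, from context switching and contention for connections.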

To submit a task use:

fjp.execute(new SQLRecursiveAction(/**parameters**/));

The SQL class (in this case an inner class) would be:

private class SQLRecursiveAction extends RecursiveAction{
      private final String aString;

      public SQLRecursiveAction(String aString){
            this.aString = aString;
      }

      @Override
      protected void compute(){
            /**call to DAO Template**/
      }
}

The compute method is required because RecursiveAction declares it as a protected abstract method (an abstract method cannot be private, since subclasses must be able to see and override it).

Once the tasks are submitted, do not shut down the Fork Join Pool. Instead, there are other ways to wait for completion that allow reuse.

int w=0;
while(fjp.isQuiescent() == false && fjp.getActiveThreadCount()>0){
    w++;
    //System.out.println("Waited for "+w+" cycles");
}

When completely finished with the pool, shutdown is necessary.

fjp.shutdown(); //orderly shutdown
//fjp.shutdownNow() for immediate shutdown
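Putting the pieces together, here is a sketch of the whole lifecycle: submit a batch, wait for the pool to go idle, reuse it for a second batch, and only then shut down. It swaps the polling loop for awaitQuiescence, which Java SE 8 added to ForkJoinPool. The CountingAction class is a hypothetical stand-in for the SQL action and bumps a counter instead of calling a database.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolLifecycleSketch {
    /** Stand-in for the SQL action: increments a counter instead of posting a record. */
    static class CountingAction extends RecursiveAction {
        private final AtomicInteger posted;

        CountingAction(AtomicInteger posted) {
            this.posted = posted;
        }

        @Override
        protected void compute() {
            posted.incrementAndGet();
        }
    }

    /** Runs two batches on one pool and returns how many actions completed. */
    static int runTwoBatches(int tasksPerBatch) {
        ForkJoinPool pool = new ForkJoinPool(4);
        AtomicInteger posted = new AtomicInteger();
        for (int batch = 0; batch < 2; batch++) {
            for (int i = 0; i < tasksPerBatch; i++) {
                pool.execute(new CountingAction(posted));
            }
            // Waits until the pool is idle or the timeout passes; the pool stays usable
            pool.awaitQuiescence(1, TimeUnit.MINUTES);
        }
        pool.shutdown(); // orderly: already submitted tasks still run
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return posted.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoBatches(50)); // prints 100
    }
}
```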


In the end, the Fork Join Pool significantly improved the performance of my application, which can parse any type of document into attributes stored in JSON strings and posted to the database. The number of records I was able to parse and post increased dramatically, from several hundred per minute to over 10,000 records per minute with the same number of attributes. It may also be wise to consider other improvements such as Tokutek's fractal tree indexing for large and normalized databases.