Fluff Stuff: Better Governments, Better Processes, Simplr Insites

Cities are heading toward bankruptcy. The credit rating of Stockton, CA was downgraded. Harrisburg, PA is actually bankrupt. It is only a matter of time before Chicago implodes. Since 1995, city debt has risen by between $1.3 and $1.8 trillion. While a large chunk of this cost comes from waste, more of it results from relying on intuition instead of data when tackling infrastructure and new projects. Think of your city government as the boss who likes football more than his job, so he builds a football stadium even though your company is in the submarine industry.

This is not an unsolvable nightmare.

Take an effective use case where technology and government processes were outsourced. As costs rose in Sandy Springs, GA, the city outsourced and achieved more streamlined processes, better technology, and lower costs. Today, without raising taxes, the city is in the green. While Sandy Springs residents are wealthy, even poorer cities can learn from this experience.

Cities need to run projects in a rigorous, scientific manner, which requires an immense amount of clean, high-quality, well-managed data isolated into individual projects. With an average of $8,000 spent per employee on technology each year, and with immense effort spent acquiring analysts and modernizing infrastructure, cities are struggling to keep up.

It is my opinion, one I am basing a company on, that providing quality data management, sharing and collaboration tools, IT infrastructure, and powerful project and statistical management systems in a single SaaS tool can greatly reduce that $8,000-per-employee cost and improve budgets. These systems can even reduce the amount of administrative staff, allowing money to flow to where it is needed most.

How can a collaborative tool tackle the cost problem? Through:

  • collaborative knowledge sharing of working, ongoing, and even failed solutions
  • public facing project blogs and information on organizations, projects, statistical models, results, and solutions that allow even non-mathematical members of an organization to learn about a project
  • a security-minded institutional resource manager (IRM, better thought of as a large, securable, shared file storage system) capable of expanding into the petabytes while maintaining compliance with FERPA, HIPAA, and state and local regulations
  • the option to share data externally, keep it internal, or keep it completely private while obfuscating names and other protected information
  • complexity analysis (graph based analysis) systems for people, projects, and organizations clustered for comparison
  • strong comparison tools
  • potent, learned aggregation systems built with validity in mind, ingesting everything from streamed sensor and internet data to surveys and uploads
  • powerful drag and drop integration and ETL with mostly automated standardization
  • deep diving upload, data set, project, and model exploration using natural language searching
  • integration with everything from a phone to a tablet to a powerful desktop
  • access controls for sharing the bare minimum amount of information
  • outsourced IT infrastructure including infrastructure for large model building
  • validation using proven algorithms and increased awareness of what that actually means
  • anomaly detection
  • organization of models, data sets, people, and statistical elements into a single space for learning
  • connectors to popular visualizers such as Tableau and Vantara, with a customizable dashboard for general reporting
  • downloadable data sets, with two-entity verification if required, that can be streamed or used in Python and R

Tools such as ours can reduce the cost of IT by as much as 65%. We eliminate much of the waste in the data science pipeline while trying to be as simple as possible.

We should consider empowering and streamlining the companies, non-profits, and government entities such as schools and planning departments that perform vital services before our own lives are greatly affected. Debt and credit are not solutions to complex problems.

Take a look, or don’t. This is a fluff piece on something I am passionately building. Contact us if you are interested in a beta test.

Security in a Microservices Environment

Too often, security is overlooked in a system. It is not enough to separate hardware or merely have an SSL certificate. There are some tricks that help in building secure microservices.

This article examines some of the security mechanisms available for helping prevent unauthorized access to a system. Note that the title says security rather than secured, as nothing is ever 100 percent secure.

Related Articles:

Discovering Sharable Resources in a Microservices Environment

Segmenting Microservices

General Principles

While this article does not cover every potential security measure, a few are presented.

In general, however, a security policy and its implementation should:

  • cover the entire system as a unit and examine each component
  • provide mechanisms for common logging, bug tracking, and monitoring
  • be wary of data visibility
  • analyze your data
  • provide a mechanism for security auditing and recommendations
  • include penetration testing
  • include policies regarding self-checking, monitoring, and violations
  • make everyone paranoid (cheers!)

Self-Checking

The most vulnerable point in any organization is the employee. Security experts are good, but Lola at the front desk probably doesn’t care much whether her password is cat or randomized into a million pieces. Of course, long, obscure passwords were recently found to be about as secure as cat, so maybe Lola is just a tad on the efficient side.

It is wise to:

  • phish employees (offer to take them fishing)
  • attack your system with brute force and deploy other mechanisms for evil
  • check for backdoors
  • penetration test using other means

There have been software companies where standard passwords were never changed while proprietary information was routed through the system. Imagine being able to type guest and admin into a console, set up a RabbitMQ router, and instantly have access to every proprietary real estate listing.

Hardware Separation

Hardware separation is critical for the success of a system. Externally facing applications should never be placed on the same hardware as sensitive services meant for internal consumption.

In general the following should be separated:

  • a web application
  • data storage
  • services accessing data storage
  • services maintaining sensitive information

Simplr Insites, LLC, of which I am 50 percent owner, separates each of these types of services. With the increasing scrutiny placed on IT firms regarding personal information, privacy, and planning, this is not just a good thing but required by law. Colorado’s recent educational privacy laws demand that a security plan be presented for approval before an organization becomes a partner with a school district.

Password Protection

The most obvious step is to password-protect any forward-facing component. That means utilizing the most appropriate form of protection and staying abreast of security vulnerabilities.

Blowfish is broken, but PBKDF2 is a recommended algorithm. Make NIST a regular part of your reading material.
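As a minimal sketch of the PBKDF2 approach using only Python’s standard library (the iteration count here is illustrative; check current NIST guidance for real deployments):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=100_000):
    """Derive a PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=100_000):
    """Re-derive and compare in constant time against the stored digest."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

Only the salt and digest are stored; the password itself never touches disk.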

Licenses and Tokens

It is not enough to secure front-end applications; any access within the system should be authorized. Giving external services tokens for access and utilizing appropriate OAuth2 toolkits helps make unauthorized access more difficult. Again, nothing is impossible.

There is a difference between a license and a token in our systems. A license is often used to give general access to a system. A token grants access to components in the system. In this way, we can revoke specific access rights without invalidating a token or invalidate access to a system altogether.
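The license/token distinction above might be modeled roughly like this; the class and method names are hypothetical, not our actual API:

```python
class AccessToken:
    """A license grants general access to a system; a token grants access
    to specific components. Component grants can be revoked individually
    without invalidating the token, or the token can be killed outright."""

    def __init__(self, token_id, components):
        self.token_id = token_id
        self.components = set(components)
        self.valid = True

    def allows(self, component):
        # Access requires both a live token and a grant for the component.
        return self.valid and component in self.components

    def revoke_component(self, component):
        self.components.discard(component)  # narrow access; token stays valid

    def invalidate(self):
        self.valid = False  # revoke access to the system altogether
```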

Always remember to store tokens and licenses correctly and never in plain text anywhere in the system. An encryption algorithm such as Fernet is a good start.
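A minimal sketch of encrypting a license before it touches storage, assuming the `cryptography` package is available:

```python
from cryptography.fernet import Fernet

def new_key():
    """Generate a Fernet key; in production this lives in a secrets manager,
    never alongside the ciphertext it protects."""
    return Fernet.generate_key()

def encrypt_secret(key, plaintext):
    """Encrypt a token or license before writing it to disk or a database."""
    return Fernet(key).encrypt(plaintext.encode())

def decrypt_secret(key, ciphertext):
    """Recover the plaintext; raises if the ciphertext was tampered with."""
    return Fernet(key).decrypt(ciphertext).decode()
```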

Encryption

Some data needs to be protected. This includes identifying information and licenses. It also means obtaining a trusted SSL certificate for each microservice.

Knowing which forms of encryption are considered secure is yet another reason to support the Boulder hippie coalition at NIST. They may be about peace and free love but are still concerned for their privacy.

When secured data needs to be transmitted, store an RSA private key in an appropriate place on the service and ensure that the front end holds the public key. When data needs to be stored, Fernet encryption may be more appropriate. Of course, be wary of the date this article was published.
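A sketch of that transmit-side pattern with the `cryptography` package: the front end encrypts with the public key, and only the service holding the private key can decrypt. Key size and padding choices here are illustrative:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# OAEP padding is the modern choice for RSA encryption.
_OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)

def generate_keypair():
    """Private key stays on the service; public key goes to the front end."""
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    return private_key, private_key.public_key()

def encrypt_for_service(public_key, message):
    """Front-end side: anyone may encrypt with the public key."""
    return public_key.encrypt(message, _OAEP)

def decrypt_on_service(private_key, ciphertext):
    """Service side: only the private-key holder can read the payload."""
    return private_key.decrypt(ciphertext, _OAEP)
```

Note that RSA-OAEP only fits small payloads; larger data is typically encrypted symmetrically (e.g. Fernet) with the symmetric key wrapped by RSA.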

Monitoring Networks

Monitoring networks for unusual behavior goes well beyond intuition and Sentry, Zabbix, or the Elastic APM (used as an application logger) these days. These tools are terrific and should be deployed alongside a strong firewall to tackle breaches.

However, advances in neural networks and pattern matching now allow security analysts to find anomalies. It may even be possible to use an LSTM to detect and block unwanted users in real time.

To this end, companies such as Darktrace are promising.

A combined approach is recommended, since someone will eventually be able to trick any single security system. If it is possible, it will happen. Do not drink the Kool-Aid regarding cyber-immunity, but do notice the word play.
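As a toy stand-in for the LSTM idea above, a rolling z-score over request rates illustrates the general shape of network anomaly detection: compare current behavior against a learned baseline. The window and threshold here are arbitrary:

```python
from collections import deque
from statistics import mean, stdev

class RateAnomalyDetector:
    """Flag request rates far outside the recent moving average."""

    def __init__(self, window=30, threshold=3.0):
        self.history = deque(maxlen=window)  # recent baseline observations
        self.threshold = threshold           # z-score cutoff for "anomalous"

    def observe(self, requests_per_minute):
        """Return True if the new reading deviates sharply from the baseline."""
        anomalous = False
        if len(self.history) >= 5:  # need a few samples before judging
            mu = mean(self.history)
            sigma = stdev(self.history) or 1e-9  # avoid division by zero
            anomalous = abs(requests_per_minute - mu) / sigma > self.threshold
        # A real system would quarantine anomalies instead of folding them
        # into the baseline; kept simple here.
        self.history.append(requests_per_minute)
        return anomalous
```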

Accounting

Related to monitoring a network, log everything that is feasible. If a user accesses a service or causes a job to be scheduled, log that job. If a service generates an error, log the error.

The Elastic APM is a terrific tool for logging, monitoring and bug reporting. Zabbix is available for monitoring hardware. Other companies I have worked for utilize Sentry for bug reporting as well.
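The "log everything feasible" rule above can be sketched with Python’s standard logging module; the service and job names are made up:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("orders-service")  # hypothetical service name

def schedule_job(job_name, job, user):
    """Log every user-triggered job, and log any error it raises."""
    logger.info("user=%s scheduled job=%s", user, job_name)
    try:
        return job()
    except Exception:
        # logger.exception captures the full traceback for the bug tracker.
        logger.exception("job=%s raised an error", job_name)
        raise
```

In a real deployment these records would be shipped to a central aggregator such as the Elastic stack rather than left on local disk.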

Conclusion

A multi-pronged approach to security is a necessity in a microservices environment. A decent system bridges the gap between services, monitors all traffic as a whole, allows learned security experts to find issues, tracks bugs in a single location, and provides much more.

NotePad: A NickName Recognition Idea

So this is just an idea. Nicknames can apparently be attached to multiple formal names. That means we could have a graph of nicknames, and also that a direct lookup list may be tricky. Here is a quick thought on how to approach a solution.


A. Graph the Names and their known connections to formal names.

B. Attach probability data to each node specifying which characteristics trigger that solution (sort of Markovian)

C. Start a Server accepting requests

D. When a request comes in, first look for all potential connections from a nickname to a formal name; if none exist, save this as a new node. The search can be performed depth first or breadth first, and could even be split across random points on the graph in a more stochastic manner such as a random walk (will need to study these more). This could also be done by simply choosing multiple random search points.

E. If there is one node, return the formal name

F. If there are multiple names take one of several approaches:

1. Calculate Bayesian probabilities given the specified characteristics and return the strongest match (this is the weakest solution).

2. Train a neural net with the appropriate kernel (RBF, linear, etc.) and return its result. (This is slow, and keeping a million neural nets in storage seems like a bad idea.)

When generating stand-alone nodes, it may be possible to use Levenshtein distance and other characteristics to attach nickname nodes to formal nodes based on a threshold. A clustering algorithm could use the formal name averages as cluster centers, and a hard cutoff (e.g. a Levenshtein distance of 1 or 2) could solidify the connection and throw out Type I and Type II error.
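A rough sketch of steps D/E plus the Levenshtein-cutoff idea, with a small hypothetical hand-built nickname graph:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,            # deletion
                            curr[j - 1] + 1,        # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

# Hypothetical graph: nickname -> set of known formal names.
NICKNAME_GRAPH = {
    "bob": {"Robert"},
    "bill": {"William"},
    "liz": {"Elizabeth"},
}

def resolve(nickname, cutoff=2):
    """Steps D/E: return known formal names for a nickname; otherwise try
    to attach a new node to any formal name within the Levenshtein cutoff."""
    matches = NICKNAME_GRAPH.get(nickname.lower())
    if matches:
        return sorted(matches)
    formals = {name for names in NICKNAME_GRAPH.values() for name in names}
    close = [f for f in formals
             if levenshtein(nickname.lower(), f.lower()) <= cutoff]
    if close:
        NICKNAME_GRAPH[nickname.lower()] = set(close)  # new node, hard-cutoff edge
    return sorted(close)
```

The probability and neural-net branches from step F are omitted; this only covers the graph lookup and the threshold-based attachment.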

Stay tuned while I flesh this post out with actual code. It may just be a proposal for a while.