New Programming Languages to Watch

Programming ultimately is about building a better mousetrap, and sometimes that includes the programming languages themselves. Today, there are at least a dozen up-and-coming languages vying to become the next C++ or Java.

Should you care? The definitive answer: It depends. One big factor is whether an emerging language stands a chance of building a following among developers, vendors and the rest of the IT industry, or whether it’s doomed to be a historical footnote. That’s always tough to predict, but a newcomer stands a better chance when its creator is a major IT player, such as Google, which created Dart as a replacement for JavaScript.

“For any language that hopes to play a similar role to JavaScript, such as Dart, some level of support would have to come from browser vendors,” says Al Hilwa, applications development software program director at IDC, an analyst firm. “Google would have to come up with a plug-in for each browser that compiles to JavaScript, and some browsers may block that for one reason or another.”

The role of a major vendor isn’t limited to creating the language itself. In some cases, the vendor can be influential when it creates an ecosystem and then encourages use of a certain language.

“For example, Microsoft pushed C# hard when it launched .NET a decade ago, which resulted in great traction for that language,” says Hilwa. “The Microsoft developer system is somewhat unique in that, up to this point, Microsoft has huge credibility in moving the ecosystem with its actions given its dominance in personal computing. But even then, many languages are made available to please specific factions of developers, and not necessarily because they are expected to dominate. I would put F# in this category.”

When a particular language is able to build market share, it’s often at the expense of an incumbent -- hence the better mousetrap analogy. Time will tell whether that’s the case with Dart and JavaScript.

“JavaScript is much maligned, though it’s often used primarily as a syntax approach for all manner of different specific semantic implementations,” says Hilwa. “Browsers purport to support JavaScript but have great variations. Other server technologies use the JavaScript syntax but solve different types of problems, such as Node for asynchronous server computing.”

Emerging languages also have to overcome the fact that incumbency often has its privileges -- or at least an installed base of products and people that are unwilling or unable to change.

“The issue with launching new languages to replace old and potentially inferior ones is the body of code written and the mass of developers who have vested skills,” says Hilwa. “It’s very hard to create a shift in the industry due to these effects. Such a change requires a high level of industry consensus and a sustained multivendor push that involves the key influential vendors.

“For example, if IBM was to put its power behind Dart, then that might help it. It may be hard for Google to muster up such long-term commitment given its culture of focus on the hot and new.”

Meet Dart, F# and Fantom

Here’s an overview of three emerging languages that, at this point, seem to have enough market momentum that enterprises should keep an eye on them:

  • Dart is a class-based language that’s supposed to make it easier to develop, debug and maintain Web applications, particularly large and thus unwieldy ones. It also uses syntax similar to that of JavaScript, which should be helpful for learning Dart. Dart’s creators say one goal is to make the language applicable for all devices that use the Web, including smartphones, tablets and laptops, as well as all major browsers.

  • F# is one of the elder newcomers in the sense that Microsoft began shipping it with Visual Studio 2010. Pronounced “F sharp,” it’s a functional-style language that’s designed to be easy to integrate with imperative languages such as C++ and Java. It also supports parallel programming, which is increasingly important as multicore processors become more common.

  • Fantom is designed to enable cross-platform development, spanning Java VM, .NET CLR and JavaScript in browsers. “But getting a language to run on both Java and .NET is the easy part,” say its creators. “The hard part is getting portable APIs. Fantom provides a set of APIs that abstract away the Java and .NET APIs. We actually consider this one of Fantom’s primary benefits, because it gives us a chance to develop a suite of system APIs that are elegant and easy to use compared to the Java and .NET counterparts.” Fantom’s ease of portability means that it eventually could be extended for use with Objective-C for the iPhone, LLVM or Parrot.

Is ParalleX This Year’s Model?

Scientific application developers have masses of computing power at their disposal with today’s crop of high-end machines and clusters. The trick, however, is harnessing that power effectively. Earlier this year, Louisiana State University’s Center for Computation & Technology (CCT) released its approach to the problem: an open-source runtime system implementation of the ParalleX execution model. ParalleX aims to replace, at least for some types of applications, the Communicating Sequential Processes (CSP) model and the well-established Message Passing Interface (MPI), a programming model for high-performance computing. The runtime system, dubbed High Performance ParalleX (HPX), is a library of C++ functions that targets parallel computing architectures. Hartmut Kaiser -- lead of CCT’s Systems Technology, Emergent Parallelism, and Algorithm Research (STE||AR) group and adjunct associate research professor of the Department of Computer Science at LSU -- recently discussed ParalleX with Intelligence in Software.

Q: The HPX announcement says that HPX seeks to address scalability for “dynamic adaptive and irregular computational problems.” What are some examples of those problems?

Hartmut Kaiser: If you look around today, you see that there’s a whole class of parallel applications -- big simulations running on supercomputers -- which are what I call “scaling-impaired.” Those applications can scale up to a couple of thousand nodes, but the scientists who wrote those applications usually need much more compute power. The simulations they have today have to run for months in order to have the proper results.

One very prominent example is the analysis of gamma ray bursts, an astrophysics problem. Physicists try to examine what happens when two neutron stars collide or two black holes collide. During the collision, they merge. During that merge process, a huge energy eruption happens, which is a particle beam sent out along the axis of rotation of the resulting star or, most often, a black hole. These gamma ray beams are the brightest energy source we have in the universe, and physicists are very interested in analyzing them. The types of applications physicists have today only cover a small part of the physics they want to see, and the simulations have to run for weeks or months.

And the reason for that is those applications don’t scale. You can throw more compute resources at them, but they can’t run faster. If you compare the number of nodes these applications can use efficiently -- on the order of a thousand -- with the available compute power on high-end machines today -- nodes numbering in the hundreds of thousands -- you can see the frustration of the physicists. At the end of this decade, we expect to have machines providing millions of cores and billion-way parallelism.

The problem is an imbalance of the data distributed over the computer. Some parts of a simulation work on a little data and other parts work on a huge amount of data.

Another example: graph-related applications, where certain government agencies are very interested in analyzing graph data based on social networks. They want to analyze certain behavioral patterns expressed in the social networks and in the interdependencies of the nodes in the graph. The graph is so huge it doesn’t fit in the memory of a single node anymore. It is also imbalanced: Some regions of the graph are highly connected, and some regions are almost disconnected from each other. The irregularly distributed graph data structure creates an imbalance. A lot of simulation programs are facing that problem.

Q: So where specifically do CSP and MPI run into problems?

H.K.: Let’s try to do an analogy as to why these applications are scaling-impaired. What are the reasons for them to not be able to scale out? The reason, I believe, can be found in the “four horsemen”: Starvation, Latency, Overhead, and Waiting for contention resolution -- or SLOW. Those four factors are the ones that limit the scalability of our applications today.

If you look at classical MPI applications, they are written for timestep-based simulation. You repeat the timestep evolution over and over again until you are close to the solution you are looking for. It’s an iterative method for solving differential equations. When you distribute the data onto several nodes, you cut the data apart into small chunks, and each node works on part of the data. After each timestep, you have to exchange information on the boundary between the neighboring data chunks -- as distributed over the nodes -- to make the solution stable.

The code that is running on the different nodes is kind of in lockstep. All the nodes do the timestep computation at the same time, and then the data exchange between the nodes happens at the same time. And then it goes to computation and back to communication again. You create an implicit barrier after each timestep, when each node has to wait for all other nodes to join the communication phase. That works fairly well if all the nodes have roughly the same amount of work to do. If certain nodes in your system have a lot more work to do than the others -- 10 times or 100 times more work -- what happens is 90 percent of the nodes have to wait for 10 percent of the nodes that have to do more work. That is exactly where these imbalances play their role. The heavier the imbalance in data distribution, the more wait time you insert in the simulation.
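
To put a number on Kaiser's point, the wait time behind a global barrier can be worked out with simple arithmetic. The sketch below is a toy model rather than MPI code: it assumes a hypothetical 10-node run in which every timestep ends in a barrier, and the workloads are made-up values chosen so that one node carries 10 times the work of the others.

```python
def barrier_efficiency(work_per_node):
    """Fraction of paid-for node time spent computing in one timestep
    when every node must wait at a global barrier for the slowest one."""
    step_time = max(work_per_node)           # the barrier releases when the slowest node finishes
    busy = sum(work_per_node)                # node-seconds of useful computation
    paid = step_time * len(work_per_node)    # node-seconds occupied, including waiting
    return busy / paid


balanced = [1.0] * 10               # hypothetical: every node has the same work
imbalanced = [1.0] * 9 + [10.0]     # hypothetical: one node has 10x the work

print(barrier_efficiency(balanced))    # 1.0
print(barrier_efficiency(imbalanced))  # 0.19
```

In the imbalanced case, nine of the ten nodes sit idle for 90 percent of each timestep, so only 19 percent of the machine time does useful work.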

That is the reason MPI usually doesn’t work well with very irregular programs. More concretely, you will have to invest a lot more effort into the development of those programs, a task that is often beyond the abilities of the domain scientists and outside the constraints of a particular project. You are very seldom able to evenly distribute data over the system so that each node has the same amount of work, or it is just not practical to do so because you have dynamic, structural changes in your simulation.

I don’t want to convey the idea that MPI is bad or something not useful. It has been used for more than 15 years now, with high success for a certain class of simulations and a certain class of applications. And it will be used in 10 years for a certain class of applications. But it is not well-fitted for the type of irregular problems we are looking at.

ParalleX and its implementation in HPX rely on a couple of very old ideas, some of them published in the 1970s, in addition to some new ideas which, in combination, allow us to meet the challenges we have to address to utilize today’s and tomorrow’s high-end computing systems: energy, resiliency, efficiency and -- certainly -- application scalability. ParalleX is defining a new model of execution, a new approach to how our programs function. ParalleX improves efficiency by exposing new forms of -- preferably fine-grain -- parallelism, by reducing average synchronization and scheduling overhead, by increasing system utilization through full asynchrony of workflow, and by employing adaptive scheduling and routing to mitigate contention. It relies on data-directed, message-driven computation, and it exploits the implicit parallelism of dynamic graphs as encoded in their intrinsic metadata. ParalleX prefers methods that allow it to hide latencies -- not methods for latency avoidance. It prefers “moving work to the data” over “moving data to the work,” and it eliminates global barriers, replacing them with constraint-based, fine-grain synchronization techniques.
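
The effect of dropping the global barrier can be illustrated with a small model. The sketch below does not use the HPX API and is not how HPX is implemented. It assumes a hypothetical 1-D chain of four nodes, free communication, and made-up per-step costs in which the heavy node alternates between the two ends of the chain. It compares a run where every step ends in a global barrier with one where each node waits only for its own previous result and those of its immediate neighbors.

```python
def barrier_makespan(costs):
    """Total time when every timestep ends in a global barrier."""
    return sum(max(step) for step in costs)


def dataflow_makespan(costs):
    """Total time when a node starts step t as soon as it and its
    immediate neighbors have finished step t-1 (no global barrier)."""
    n = len(costs[0])
    finish = [0.0] * n                    # finish time of the previous step, per node
    for step in costs:
        nxt = []
        for i in range(n):
            # Slice covers node i and whichever neighbors exist at the chain ends.
            ready = max(finish[max(i - 1, 0): i + 2])
            nxt.append(ready + step[i])
        finish = nxt
    return max(finish)


# Hypothetical costs: rows are timesteps, columns are nodes.
costs = [
    [5, 1, 1, 1],
    [1, 1, 1, 5],
    [5, 1, 1, 1],
    [1, 1, 1, 5],
]

print(barrier_makespan(costs))   # 20
print(dataflow_makespan(costs))  # 12.0
```

With barriers, every step costs as much as its slowest node. With neighbor-only dependencies, work on one end of the chain overlaps with the slow node on the other end, which is the kind of latency hiding the execution model aims for.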

Q: How did you get involved with ParalleX?

H.K.: The initial conceptual ideas and a lot of the theoretical work have been done by Thomas Sterling. He is the intellectual spearhead behind ParalleX. He was at LSU for five or six years, and he left only last summer for Indiana University. While he was at LSU, I just got interested in what he was doing and we started to collaborate on developing HPX.

Now that he’s left for Indiana, Sterling is building his own group there. But we still tightly collaborate on projects and on the ideas of ParalleX, and he is still very interested in our implementation of it.

Q: I realize HPX is still quite new, but what kind of reception has it had thus far? Have people started developing applications with it?

H.K.: What we are doing with HPX is clearly experimental. The implementation of the runtime system itself is very much a moving target. It is still evolving.

ParalleX -- and the runtime system -- is something completely new, which means it’s not the first-choice target for application developers. On the other hand, we have at least three groups that are very interested in the work we are doing. Indiana University is working on the development of certain physics and astrophysics community applications. And we are collaborating with our astrophysicists here at LSU. They face the same problem: They have to run simulations for months, and they want to find a way out of that dilemma. And there’s a group in Paris that works on providing tools for people who write code in MATLAB, a high-level toolkit widely used by physicists to write simulations. But it’s not very fast, so the Paris group is writing a tool to convert MATLAB to C++, so the same simulations can run a lot faster. They want to integrate HPX in their tool.

ParalleX and HPX don’t have the visibility of the MPI community yet, but the interest is clearly increasing. We have some national funding from DARPA and NSF. We hope to get funding from the Department of Energy in the future; we just submitted a proposal. We expect many more people will gain interest once we can present more results in the future.


What’s Next For OpenACC?

The OpenACC parallel programming standard emerged late last year with the goal of making it easier for developers to tap graphics processing units (GPUs) to accelerate applications. The scientific and technical programming community is a key audience for this development. Jeffrey Vetter, professor at Georgia Tech’s College of Computing and leader of the Future Technologies Group at Oak Ridge National Laboratory, recently discussed the standard. He is currently project director for the National Science Foundation’s (NSF) Track 2D Experimental Computer Facility, a cooperative effort that involves the Georgia Institute of Technology and Oak Ridge, among other institutions. Track 2D’s Keeneland Project employs GPUs for large-scale heterogeneous computing.

Q: What problems does OpenACC address?

Jeffrey Vetter: We have this Keeneland Project -- 360 GPUs deployed in its initial delivery system. We are responsible for making that available to users across NSF. The thinking behind OpenACC is that all of those people may not have the expertise or funding to write CUDA code or OpenCL code for all of their scientific applications.

Some science codes are large, and any rewriting of them -- whether it is for acceleration or a new architecture of any type -- creates another version of the code and the need to maintain that software. Some of the teams, like the climate modeling team, just don’t want to do that. They have validated their codes. They have a verification test that they run, and they don’t want to have different versions of their code floating around.

It is a common problem in software engineering: People branch their code to add more capability to it, and at some point they have to branch it back together again. In some cases, it causes conflict.

OpenACC really lets you keep your applications looking like normal C or C++ or Fortran code, and you can go in and put the pragmas in the code. It’s just an annotation on the code that’s available to the compiler. The compiler takes that and says, “The user thinks this particular block or structured loop is a good candidate for acceleration.”

Q: What’s the impact on scientific/technical users?

J.V.: We have certain groups of users that are very sophisticated and willing to do most anything to port their code to a GPU -- write a new version of the code, sit down with an architecture expert and optimize it.

But some don’t want to write any new code other than putting pragmas in the code. They really are conservative in that respect. A lot of the large codes out there used by DOE labs just haven’t been ported to GPUs because there’s uncertainty over what sort of performance improvement they might see, as well as a lack of time to just go and explore that space.

What we are trying to do is broaden the user base on the system and make GPUs, and in fact other types of accelerators, more relevant for other users who are more conservative.

After a week of just going through the OpenACC tutorials, users should be able to go in and start experimenting with accelerating certain chunks of their applications. And those would be people who don’t have experience in CUDA or OpenCL.

Q: Does OpenACC have sufficient support at this point?

J.V.: PGI, CAPS and Cray: We expect they will start adhering to OpenACC with not too much trouble. What’s less certain is how libraries and performance analysis tools and debugging tools will work with the new standard. One thing that someone needs to make happen is to ensure that there is really a development environment around OpenACC.

OpenMP had the same issue a decade ago. They had to create the specification and the pragmas and other language constructs, and people had to create the runtime system that executes the code and does the data movement.

Q: What types of applications need acceleration?

J.V.: Generally, we have been looking at applications that have this high computational intensity. You have things like molecular dynamics and reverse time migration and financial modeling -- things that basically have the characteristic that you take a kernel and put it in a GPU and it runs there for many iterations, without having to transfer data off the GPU.
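
Vetter's description of a kernel that stays on the GPU for many iterations comes down to amortizing the cost of moving data. A rough cost model makes that visible. The timings below are invented for illustration and are not measurements from Keeneland or any real device: a CPU iteration of 10 ms, a GPU iteration of 1 ms, and a one-off transfer cost of 100 ms.

```python
def gpu_speedup(cpu_iter_ms, gpu_iter_ms, transfer_ms, iterations):
    """Speedup of an offloaded kernel over the CPU version when the data
    is transferred once and the kernel then runs `iterations` times."""
    cpu_total = cpu_iter_ms * iterations
    gpu_total = transfer_ms + gpu_iter_ms * iterations
    return cpu_total / gpu_total


# Hypothetical timings, in milliseconds.
print(round(gpu_speedup(10, 1, 100, iterations=1), 2))     # 0.1  (offloading loses)
print(round(gpu_speedup(10, 1, 100, iterations=1000), 2))  # 9.09 (transfer is amortized)
```

Under these assumptions a single iteration runs about ten times slower on the GPU once the transfer is counted, while a thousand iterations approach the kernel's raw tenfold advantage.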

OpenACC itself is really targeted at kernels that have a structured block or a structured loop that is regular. That limits the applicability of the compiler to certain applications. There will be applications with unstructured mesh or loops that are irregular in that they have conditions or some type of compound statements that make it impossible for the compiler to analyze. Users will have to unroll those loops so an OpenACC compiler has enough information to generate the code.

Some kernels are not going to work well on OpenACC whether you work manually or with a compiler. There isn’t any magic. I’m supportive, but trust and verify.

Does Your Enterprise Need Its Own App Store?

By 2015, more than 65 percent of North American business users will have a smartphone, and more than 26 percent of enterprises have currently deployed tablets or are at least considering them. That adoption has many enterprises developing smartphone and tablet apps for internal use. These business-to-employee apps are part of a category that will have more than 830 million users by 2016, ABI Research predicts.

These trends have enterprises such as IBM and Medtronic creating internal app stores, which ensure that employees, contractors and other authorized users get the apps that match their device models and job responsibilities. It’s a strategy built around security, productivity and convenience. Private app stores:

  • Enhance employee productivity. When the enterprise’s app store ensures that employees get the right app version for their model of smartphone or tablet, they’re more productive because they’re not tying up the help desk trying to make the wrong version work. 

  • Ensure that employees securely get the right app/data based on their responsibilities when the app store is properly configured with this functionality. For example, the EMEA sales team would get sales-related apps for their region instead of the APAC versions. Just as important, employees who don’t work in sales -- as well as non-employees -- can’t download those apps, thus preventing unauthorized access to the information that the sales apps provide. “For an enterprise store, you’re doing app distribution based on entitlements and roles, which means that you have to have tight integration and secure access with your identity infrastructure,” says Chris Perret, CEO of Nukona, a vendor that specializes in enterprise app stores.

  • Provide enterprises with deeper insights into who’s using their apps and how. For example, Apple’s App Store ensures end user privacy by giving developers only high-level statistics such as the volume of downloads on a weekly basis. But at Medtronic’s internal store, which launched in January 2011, everyone has to log in as an employee or contractor, which gives the company’s developers more insight. “They often want to be able to identify who’s installing their app, how long they’ve had it and if they’ve installed a new version,” says Jim Freeland, head of Medtronic’s enterprise mobility group. 

  • Offer access to user data that enables developers to determine whether their target audience is actually using their app. They also conveniently can send alerts to users when a new version is available. “You have a better way to stay in touch with users than what the App Store can provide,” says Freeland. Like the App Store and Android Market, private app stores sometimes give users the opportunity to rate and review apps, providing developers with additional insights.
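
The entitlement-based distribution described in the list above can be sketched in a few lines. This is a minimal illustration, not how Nukona, IBM or Medtronic implement their stores. The `App` and `User` types, the role names and the catalog entries are all hypothetical, and a real store would take roles from the company's identity infrastructure rather than from hard-coded values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    platform: str        # e.g. "ios" or "android"
    roles: frozenset     # roles entitled to install this build


@dataclass(frozen=True)
class User:
    name: str
    platform: str
    roles: frozenset     # hypothetical: would come from the identity system


def visible_apps(user, catalog):
    """Names of the apps a user may install: the build must match the
    user's device platform and share at least one role with the user."""
    return [app.name for app in catalog
            if app.platform == user.platform and app.roles & user.roles]


catalog = [
    App("Sales EMEA", "ios", frozenset({"sales-emea"})),
    App("Sales EMEA", "android", frozenset({"sales-emea"})),
    App("Sales APAC", "ios", frozenset({"sales-apac"})),
    App("Directory", "ios", frozenset({"employee"})),
]

anna = User("anna", "ios", frozenset({"employee", "sales-emea"}))
visitor = User("visitor", "ios", frozenset())

print(visible_apps(anna, catalog))     # ['Sales EMEA', 'Directory']
print(visible_apps(visitor, catalog))  # []
```

The same filter handles both concerns from the list: the platform check delivers the right build for the device, and the role check keeps the sales apps away from anyone outside the sales team.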

Couldn’t enterprises make their private app stores accessible to customers and resellers too? Not necessarily. Medtronic says its store will always be open only to employees and contractors. “We are unable to distribute apps to customers, patients or third parties other than going through the iTunes App Store,” says Freeland. “That’s an agreement that Apple has with all companies that buy their enterprise developer license.”

Another difference between public and private is the approval process for new apps and updates. For example, instead of waiting seven business days or longer for Apple to review an app and release it to the App Store, Medtronic handles that task for internally developed apps, and it guarantees its business units that the approval will take no more than five days.

IBM and Medtronic both built their app stores from scratch. But that’s not a viable option for smaller companies that don’t have the internal resources to build their own. That’s why vendors such as Nukona are offering stores on a white-label basis, where enterprises simply add their branding.

Either way, one consideration is protecting confidential data -- not just what the app stores on the device, but also the data in the corporate network. After all, an app essentially is a door into the enterprise. Some mobile OS’s make it easy to share apps between devices, so one way to mitigate that security threat is to use a form of multifactor authentication.

“During the time of distribution, we inject a specific ID into that distributed app so we know that it’s this app with this user on this device,” says Perret.
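
Perret does not say how that ID is constructed. One plausible way to bind an app to a user and a device is a keyed hash over the three identifiers, sketched below. This is an assumption made for illustration and not a description of Nukona's mechanism. The function names, identifiers and secret are hypothetical.

```python
import hashlib
import hmac


def issue_tag(secret, app_id, user_id, device_id):
    """Tag computed by the store at distribution time and embedded in the app copy."""
    # A separator character that cannot appear in the IDs keeps the fields unambiguous.
    message = "\x1f".join((app_id, user_id, device_id)).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_tag(secret, app_id, user_id, device_id, tag):
    """Server-side check that a presented tag matches this app, user and device."""
    expected = issue_tag(secret, app_id, user_id, device_id)
    return hmac.compare_digest(expected, tag)   # constant-time comparison


secret = b"store-signing-key"   # hypothetical; a real key would live in a key store
tag = issue_tag(secret, "sales-emea", "anna", "device-123")

print(verify_tag(secret, "sales-emea", "anna", "device-123", tag))  # True
print(verify_tag(secret, "sales-emea", "anna", "device-999", tag))  # False
```

A copy of the app moved to another device would present a tag that no longer matches, so the server could refuse it.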

Bring Your Own Device (BYOD)

Private app stores also give enterprises a way to deal with the OS fragmentation that occurs when employees are allowed to bring their own tablet or smartphone instead of receiving a company-issued device. Nearly half of all smartphones that employees use for work worldwide are already employee-owned, says the research firm Strategy Analytics.

As a result, some enterprises have a single app store for multiple OS’s. For example, IBM’s WhirlWind store is a one-stop shop for Android, BlackBerry, iOS and Windows apps. “We recognized early on that there needs to be some commonalities, a single go-to place for folks to find and get mobile applications,” says Bill Bodin, IBM’s CTO for mobility.

With very few exceptions -- such as some government agencies -- private app stores don’t prevent users from accessing the public stores. That’s partly because the enterprises don’t want to undermine the value of smartphones and tablets that employees pay for, even when their employer reimburses them. But enterprises do sometimes segregate the two. “So when people are trying to get to their personal data and the features they pay for, they’re not encumbered by the enterprise challenges for increased authentication, etc.,” says Bodin.


Language Lessons: Where New Parallel Developments Fit Into Your Toolkit

The rise of multicore processors and programmable GPUs has sparked a wave of developments in parallel programming languages.

Developers seeking to exploit multicore and manycore systems -- the latter involving hundreds or potentially thousands of processors -- now have more options at their disposal. Parallel languages making moves of late include SEJITS from the University of California, Berkeley; The Khronos Group’s OpenCL; the recently open-sourced Cilk Plus; and the newly created ParaSail language. Developers may encounter these languages directly, though the wider community will most likely find them embedded within higher-level languages.

Read on for the details:

Scientific Developments

Parallel computing and programming have been around for years in the high-performance scientific computing field. Recent developments in this arena include SEJITS (selective, embedded, just-in-time specialization), a research effort at the University of California, Berkeley.

The SEJITS implementation for the Python high-level language, which goes by ASP (ASP is SEJITS for Python), aims to make it easier for scientists to harness the power of parallelism. Scientists favor speed of development as they work to solve a specific problem, while professional programmers take the time to devise a parallel strategy to boost performance.

Armando Fox, adjunct associate professor with UC Berkeley’s Computer Science Division, says SEJITS bridges the gap between productivity programmers and efficiency programmers. SEJITS, he notes, allows productivity programmers to write in a high-level language, a benefit facilitated by efficiency programmers’ ability to capture the parallel algorithm. Intel and Microsoft are early-adopter customers.

Here’s how it works: A scientist/programmer leverages a specializer -- a design pattern, essentially -- that addresses a specific problem and is optimized to run in parallel settings. Specializers that are currently available cover audio processing and structured grids, among other fields. This approach embeds domain-specific languages into Python with compilation occurring at runtime.
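
The "selective" part of that idea can be sketched in plain Python. The code below is not the ASP API. The registry, the decorator names and the `stencil3` pattern are hypothetical, and the "specialized" function is an ordinary Python stand-in for the parallel code that a real specializer would generate and compile at runtime. What the sketch does show is the control flow: use the expert-written specializer when it applies, and fall back to the scientist's own code when it does not.

```python
import functools

SPECIALIZERS = {}   # hypothetical registry: pattern name -> expert-written implementation


def specializer(pattern):
    """Used by the efficiency programmer to publish an implementation of a pattern."""
    def register(fn):
        SPECIALIZERS[pattern] = fn
        return fn
    return register


def specialize(pattern):
    """Used by the productivity programmer to opt a function in to specialization."""
    def wrap(fallback):
        @functools.wraps(fallback)
        def call(*args):
            fast = SPECIALIZERS.get(pattern)
            if fast is not None:
                try:
                    return fast(*args)
                except NotImplementedError:
                    pass                  # specializer declined this input
            return fallback(*args)        # plain Python always remains available
        return call
    return wrap


@specializer("stencil3")
def stencil3_fast(grid):
    # Stand-in only: a real specializer would emit and compile parallel code here.
    if len(grid) < 3:
        raise NotImplementedError("grid too small to specialize")
    inner = [(a + b + c) / 3 for a, b, c in zip(grid, grid[1:], grid[2:])]
    return [grid[0]] + inner + [grid[-1]]


@specialize("stencil3")
def smooth(grid):
    """Three-point average over a 1-D structured grid, written the obvious way."""
    out = list(grid)
    for i in range(1, len(grid) - 1):
        out[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3
    return out


print(smooth([0, 3, 6, 9]))  # [0, 3.0, 6.0, 9]  (specialized path)
print(smooth([1, 2]))        # [1, 2]            (fallback path)
```

The scientist only ever calls `smooth`; which implementation runs is decided at call time.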

ASP specializers are available via GitHub, with a planned repository to provide a catalog of specializers and metadata. The beginnings of such a repository may be in place by December, says Fox.

“As more and more efficiency programmers contribute their patterns to this repository of patterns, application writers can pick up and use them as they would use libraries,” explains Fox.

Fox characterized SEJITS as a prototype -- albeit one with customers. He says researchers are working to get SEJITS documentation more complete.

Tapping GPUs and More

Stemming from a graphics background, OpenCL appears to be broadening its reach after emerging in Mac OS X a couple of years ago.

OpenCL, now a Khronos Group specification, consists of an API set and OpenCL C, a programming language. On one level, OpenCL lets programmers write applications that take advantage of a computer’s GPU for general, non-graphical purposes. GPUs, inherently parallel, have become programmable in recent years. But OpenCL’s role extends to increasingly parallel CPUs, notes Neil Trevett, vice president of mobile content at NVIDIA and president of the Khronos Group.

“Historically, you have had to use different programming frameworks for programming ... CPUs and GPUs,” says Trevett. “OpenCL lets developers write a single program using a single framework to use all of the heterogeneous parallel resources on a system.”

Those resources could include multiple CPUs and GPUs mixed together and exploited by a single application, he adds.

OpenCL’s scope includes multicore CPUs, field-programmable gate arrays, and digital signal processors. The basic approach is to use OpenCL C to write a kernel of work and employ the APIs to spread those kernels out across the available computing resources, says Trevett.
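
The kernel-plus-dispatch structure Trevett describes can be mimicked in ordinary Python. The sketch below does not use OpenCL or any OpenCL binding: `enqueue` is a hypothetical helper built on a thread pool, and the kernel is a Python function rather than OpenCL C. Python threads will not make this computation faster. The point is only the shape of the program: one small function written per work item, and a separate call that spreads the index space over the available workers.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(gid, a, x, y, out):
    """Work for a single item: each invocation touches only index `gid`."""
    out[gid] = a * x[gid] + y[gid]


def enqueue(kernel, global_size, *args, workers=4):
    """Hypothetical dispatcher: run `kernel` once per index in range(global_size)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so that any exception raised in a kernel surfaces here.
        list(pool.map(lambda gid: kernel(gid, *args), range(global_size)))


x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
out = [0] * len(x)
enqueue(saxpy_kernel, len(x), 2, x, y, out)

print(out)  # [12, 24, 36, 48]
```

Because every work item writes to its own output slot, the items can run in any order and on any worker, which is what lets a framework distribute them across heterogeneous devices.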

OpenCL C is based on C99 with a few modifications, says Trevett. Those include changes that let developers express parallelism and the removal of recursion, he notes.

OpenCL emphasizes power and flexibility versus ease of programming. A programmer explicitly controls memory management and has considerable control over how computation happens on a system, says Trevett. But higher-level language tools and frameworks may be built upon OpenCL’s foundational APIs, he adds. Indeed, Khronos Group has made C++ bindings available for OpenCL.

Trevett says the C++ bindings will make OpenCL more accessible. In another initiative, Khronos Group is working on an intermediate binary representation of OpenCL. The objective is to help developers who don’t want to ship source code along with the programs they write in OpenCL.

A Broader Take

Earlier this year, Intel set its Cilk Plus language on an open-source path as part of the company’s effort to make parallel programming more widely available.

Cilk Plus is an extension to C and C++ that supports parallel programming. Robert Geva, principal engineer at Intel, notes that Intel started by implementing Cilk Plus in its own compiler products. Then, after gaining initial success with customer adoption, the company extended the work to open source by implementing Cilk Plus in the GNU Compiler Collection (GCC) through a series of releases.

The Cilk Plus extension to C/C++ aims to benefit programmers by allowing composable parallelism and by making use of hardware resources, including multiple cores and the vector operations within those cores, while remaining cache friendly.

Geva says that Cilk Plus provides a tasking model with a user-level “work stealing” runtime task scheduler. The work-stealing algorithm assigns tasks -- identified by the programmer as able to execute in parallel with each other -- to OS threads. According to Intel, the dynamic assignment of tasks to threads guarantees load balancing independently of an application’s software architecture. This approach to load balancing delivers a composable parallelism model. That is, the components of a large system may use parallelism and come from independent authors, but still be integrated into a single, parallel application.
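
The load-balancing effect of work stealing can be seen in a small simulation. The sketch below is not the Cilk Plus scheduler. It assumes unit-time tasks that all start on one worker, and it has idle workers steal from the fullest queue so that the result is deterministic, whereas a real work-stealing scheduler typically picks a victim at random. As in the classic design, owners take work from one end of their queue and thieves take from the other.

```python
from collections import deque


def run_work_stealing(tasks, workers):
    """Simulate unit-time tasks that all start on worker 0.
    Returns the number of rounds taken and the tasks each worker executed."""
    queues = [deque() for _ in range(workers)]
    queues[0].extend(tasks)
    done = [[] for _ in range(workers)]
    rounds = 0
    while any(queues):
        rounds += 1
        # Idle workers steal the oldest task from the fullest queue.
        for w in range(workers):
            if not queues[w]:
                victim = max(range(workers), key=lambda v: len(queues[v]))
                if len(queues[victim]) > 1:      # leave the victim its last task
                    queues[w].append(queues[victim].popleft())
        # Every worker that has work executes its newest task.
        for w in range(workers):
            if queues[w]:
                done[w].append(queues[w].pop())
    return rounds, done


rounds, done = run_work_stealing(list(range(8)), workers=4)
print(rounds)                  # 2
print([len(d) for d in done])  # [2, 2, 2, 2]
```

Eight tasks that would take eight rounds on the worker that created them finish in two, and no component had to know how many workers exist or how the work should be divided in advance.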

Geva says that this solves a problem for developers who were trying to build complex parallel software systems without a good dynamic load-balancing scheduler and who encountered hardware resource oversubscription and, therefore, poor performance.

The re-implementation of Cilk Plus in open source GCC is intended to help with adoption. Geva says the open-source move appeals to two types of developers: one group that prefers the GCC compiler over the Intel compiler, and a second group that is comfortable with the Intel compiler but would like to have another source.

The first components of Cilk Plus to be released into open source include the language’s tasking portion and one language construct for vector-level parallelization (#pragma simd). The tasking portion includes the compiler implementation for three keywords (_Cilk_spawn, _Cilk_sync and _Cilk_for), the runtime task scheduler and the hyperobject library. The remainder of the language will be introduced in multiple steps.

Porting to GCC will also help with Intel’s standardization objectives. The current plan is to take Cilk Plus to the C++ standards body and work on a proposal there, says Geva.

“We will be in a better position working inside a standards body with two implementations instead of one,” he explains.

A High-integrity Initiative

A newly launched language, ParaSail, focuses on high-integrity parallel programming.

Tucker Taft, chairman and chief technology officer at SofCheck, a software analysis and verification firm, designed the language. The alpha release of a compiler with executables for Mac, Linux and Windows emerged in October. Taft says the compiler isn’t intended for production use, but can be used to learn the language.

“Right now, we’re just trying to get it out there and get people interested,” says Taft.

According to Taft, creating a parallel programming language from scratch gave him the opportunity to build in safety and security. The language incorporates formal methods such as preconditions and post-conditions, which are enforced by the compiler. That approach makes ParaSail “oriented toward building a high-integrity embedded system,” notes Taft.
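
For readers unfamiliar with the terms, a precondition states what must hold before a routine runs, and a postcondition states what it promises afterward. The sketch below shows the idea in Python, not ParaSail syntax, using a hypothetical `contract` decorator. One important difference: this version checks the conditions at runtime on each call, whereas the article describes ParaSail's conditions as enforced by the compiler.

```python
import functools


def contract(pre, post):
    """Hypothetical decorator: check `pre` on the arguments before the call
    and `post` on the result (plus the arguments) after it."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args):
            if not pre(*args):
                raise ValueError(f"precondition of {fn.__name__} violated")
            result = fn(*args)
            if not post(result, *args):
                raise AssertionError(f"postcondition of {fn.__name__} violated")
            return result
        return checked
    return wrap


@contract(pre=lambda xs: len(xs) > 0,
          post=lambda r, xs: r in xs and all(r >= x for x in xs))
def largest(xs):
    """Return the largest element of a non-empty list."""
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best


print(largest([3, 9, 4]))  # 9

try:
    largest([])
except ValueError as err:
    print(err)             # precondition of largest violated
```

The contract documents the routine's obligations in a form that a tool can check, which is the property that matters for high-integrity systems.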

In another nod to secure, safety-critical systems, ParaSail does away with garbage collection as a means of memory management. Taft says garbage collection isn’t a good match for high-integrity systems, noting the difficulty of proving that a garbage collector is “correct.”

“It is also very difficult to test a garbage collector as thoroughly as is required by high-integrity systems,” he adds.

Taft’s experience in the high-integrity area includes designing Ada 95 and Ada 2005. The Defense Department once made Ada its official language, citing its ability to create secure systems. The language has found a continuing role in avionics software.

Similarly, ParaSail could cultivate a niche in aerospace. Taft cites the example of an autopilot system for a commercial jet. He also lists systems for controlling high-speed trains, medical devices and collision avoidance systems for cars.

As for distribution methods, Taft says he is working with other companies, including one with a close association with the GCC. Taft says hooking the ParaSail front end -- parser, semantic analyzer and assertion checker -- to the GCC back end would be a natural way to make the language widely available.

Another possibility: making ParaSail available as a modeling language. In that context, ParaSail could be used to prototype a complex system that would be written in another language.