The Harth Programming Language

Last Update: February 29, 2016

Why The Name “Harth”?

Harth is a contraction of Harvey Thompson. That way people can be sure who the original author was.

The name also evokes the idea of a “Hearth”: a warm and cosy, often industrial-sized, fireplace in the home or blacksmith’s, where food is cooked or new tools are forged.

Yet Another Programming Language?

The nice thing about standards is that there are so many of them to choose from.

Andrew S. Tanenbaum

I’ve used many programming languages in my 30 years of programming. Each of them has some good and some not-so-good parts. I’ve often wondered what would happen if you took all the reasonable bits of each and made a programming language from that. Harth is that experiment.

Hopefully Harth will not turn into a Jack-of-all-Trades (and master of none), nor a confusing Swiss-Army-Knife programming language (the sort you use if you can’t find your professional toolbox).

See also Mike Taylor’s: “What is your Favourite Programming Language?”

Mine would probably be Emacs. Wait, Waaa? Emacs is actually a Lisp Programming Language runtime disguised as an extremely customizable editor.

Vi-Schmi. Visual Whatevers.

Programming Languages I Borrowed Ideas From

In building Harth I’m borrowing some of the “good” ideas from various existing programming languages. Each of them is well worth learning and using in its own right. I’m also very aware of, and grateful for, the massive amount of work put in by all the clever people behind each of these programming languages.

Programming languages I’ve actually used:

Programming languages I’ve not really used, but have read about interesting ideas worth borrowing:

Find Out More

  • Read the rest of this page.
  • Read the Blog posts.

TODO

Implement these later as auto-builds from markdown docs and Doxygen code from source.
  • Read the [Language Reference] - uses Markdown/Hugo.
  • Read The [Language API] - uses Doxygen.

Project Background (September 14 2015)

The high level vision for the project is to:

  • Help software engineers make great tools for themselves and other software engineers.

Software Engineers (programmers) use many tools, the most important of which is essentially a high tech type-writer: the code editor. This is something like Word (or Notepad), and they write text (code) which is saved in text files. There are many other tools which provide additional functions that help programmers. These features may already exist in the code editor, or may be found in a separate tool.

Other Programmer Tools

Some of the tools or features a programmer needs are:

  • Find Documentation - Less of the “ARGH Help How do I? Sheesh, Where is the help?”
  • Find Definition - The programmer can look up the definition of a particular piece of code, much like a dictionary or thesaurus.
  • Find References - The programmer can find all the references to a particular piece of code, much like an index or table of contents.
  • Search/Replace & Refactoring - The programmer can ask the computer to make specific global changes to the code. For example:
    • Changing the name of something everywhere.
    • Moving some code to somewhere else.
  • Auto Complete - The programmer can type short cuts and expand them, like auto-complete on phones.
  • Error Highlighting - The programmer can see errors in their code almost immediately, like the red squiggly lines in Word that show spelling mistakes.
  • Compilation - The programmer can compile the program into code, and at this point will be told about more problems in what they have written (like a human editor would for a book or news story).
  • Debugger - The programmer can run their program step by step and see what’s going on (debugging); often this is needed because the programmer has made a logical mistake that the computer cannot possibly detect.
  • Syntax Highlighting - The text is styled and coloured to highlight important information: much like headings in documents.
  • Interactive Session - The programmer can type additional pieces of code and perhaps modify the code while it’s running, to help correct problems (bugs) in their code.
  • Unit Tests - The programmer can write extra “test” code that checks what they have written; this can be run automatically to catch problems which are introduced as they change their program.
  • Quality Tests - The programmer can use some automated code to evaluate code quality, for example:
    • How much of the code has tests.
    • How much of the code is documented for programmers.
    • How complex the code is.
  • Issue Tracking - The programmer can create “To-Do” lists of things they need to do later; building software is like growing a city - you start with one house, but very quickly there are 1001 buildings, each with 1001 different problems to solve.
  • Change Management - The programmer can save versions of code, both locally and on the web, and use this to collaborate with others.

Woah, Too Much!

There is an almost endless list of tools a programmer might need. Consider Home Depot and how many hammers, drills, spanners, screwdrivers, rulers, saws might be needed to build and maintain a house. Likewise there are 1001 tools a programmer might need to use to build good software. Not all the tools exist in one place, and not all the tools are as good as they could be, or easily available, or cheap.

My aim is not to provide all 1001 tools. That task is far too big for one person to achieve.

But we have to start somewhere.

Long Term Aims

So a more realistic aim for this project is to:

Provide a great foundation for many programmers to write and share good tools a little easier, as and when they need them.

Currently there has been an explosion of software developers creating new programming languages out of a desire to fix the lack of good, cohesive tools. “Yet Another Programming Language” is announced, and most software developers groan at each one without quite understanding why. A few have discovered that we’re often repeating the same patterns and even mistakes. We have to try to think a little differently and dig for the hidden gems of ideas that a few smart people have discovered; some of these could make a big difference.

Medium Term Aims

So the more focused aim of this project is to:

Try to collect the most modern and alternative ideas in one place, to try something a little different.

For example:

  • Optionally talk about types as much as possible:
    • Mostly static, or inferred static, with a dash of dynamic if you wants it.
    • Types are inferred if you assign something that’s of a known type.
    • Infer return type.
    • Infer generic types.
    • Closure types are inferred most of the time.
    • Infer argument types (that can be hard, but use generics?)
    • If you “do not care yet”, use dynamic which essentially says “call method/function dynamically with cached runtime checks, your code may crash”.
    • Dynamic is in C# and a few other .NET based languages you know!
    • See Mike Taylor’s Static And Dynamic Typing
  • What if we didn’t write code as text saved to a text file?
    • Well that’s not good for things like git, diff, emacs, vi and a billion other text processors!
  • What if code was stored in a database like system?
    • SQL and anything like that actually is too much.
    • What is the fundamental thing that they actually provide. Only have that thing.
    • Hint: Probably Guids (Globally unique IDs, that is keys) for names.
    • Hint: Possibly a way to link things in relationships (definition/use etc).
  • What if code was more like the internet, a data rich, textual, graphical mixture?
    • HTML links in your code. What nonsense.
    • Autobuilt links like you get on some web based source databases.
  • What if the programming ecosystem itself fully understood itself?
    • This idea has been around since the 1960s, but most languages only go so far, providing access to understanding only the high level constructs.
    • Provide full 100% understanding of all code as a library (Microsoft are doing this with something called Roslyn, which is part of their next Visual Studio 2015 release).
    • Provide full 100% ability to convert text to executable code, and debug as a library (Apple use something called LLVM which does this at a lower level as part of their Xcode development system).
  • What if the programming ecosystem ditched manual processes that just make the human’s life harder and more miserable?
  • What if the programming ecosystem automated more processes so that the human can concentrate on the creative side more?
  • What if the programming ecosystem helped the human as much as possible to read and write good code in the first place?
  • Code completion should eventually mean you actually type less.
    • Code completer could insert “sensed” meaning, eg. inferred types.
    • Code completer means this:
function DoubleInt(x : Int) -> Int
{
    return (x * 2)
}
function PairDouble(x : Int) -> (Int, Int)
{
    return (DoubleInt(x), DoubleInt(x))
}
  • Roughly: 8 lines, 24 words, 145 keystrokes.

  • Might only be typed as (where space is also auto-complete):

fun DoubleInt(x : I) - I ret x * 2
fun PairDouble(x : I) -> (I, I) ret ( di (x), di (x) )
  • Roughly: 2 lines, 24 words, 90 keystrokes.
  • That’s roughly only 62% of the characters.

    • Longer names are only required in full to stop ambiguities.
    • Fuzzy matching and case-insensitive camelCase splitting are at work.
    • Example: typing d + i completes to DoubleInt after first defining it.
    • The fuzzy match dulbin would also match DoubleInt since it’s the only thing that uniquely contains those characters in roughly the right order.

    • Ambiguities should provide a context-sensitive helper pop-up in GUIs, much like most decent editors.
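
The completion rules in the list above can be sketched in a few lines. This is only an illustration in Python, not how Harth's completer works: the function names (`camel_initials`, `fuzzy_match`, `complete`) are made up, and "roughly the right order" is assumed here to mean "a subsequence of the name, allowing one swap of two neighbouring typed characters".

```python
import re
from typing import List


def camel_initials(name: str) -> str:
    """Lower-cased first letter of each camelCase word: 'DoubleInt' -> 'di'."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z0-9]+", name)
    return "".join(word[0] for word in words).lower()


def is_subsequence(query: str, text: str) -> bool:
    """True if the characters of query appear in text in the same order."""
    remaining = iter(text)
    return all(ch in remaining for ch in query)


def fuzzy_match(query: str, name: str) -> bool:
    """Assumed rule: the query, or the query with one adjacent pair of
    characters swapped, is a case-insensitive subsequence of the name."""
    q, n = query.lower(), name.lower()
    variants = [q] + [
        q[:i] + q[i + 1] + q[i] + q[i + 2:] for i in range(len(q) - 1)
    ]
    return any(is_subsequence(v, n) for v in variants)


def complete(query: str, names: List[str]) -> List[str]:
    """Return candidate completions: camelCase initials first, then fuzzy.
    One result means a unique completion; several means an ambiguity
    that an editor would show in a pop-up."""
    q = query.lower()
    by_initials = [n for n in names if camel_initials(n) == q]
    if by_initials:
        return by_initials
    return [n for n in names if fuzzy_match(q, n)]


known_names = ["DoubleInt", "PairDouble"]
print(complete("di", known_names))
print(complete("dulbin", known_names))
print(complete("d", known_names))
```

With these two names defined, `di` and `dulbin` each resolve to a single candidate, while a bare `d` is ambiguous and returns both names, which is the case where a pop-up would be needed.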

Still, this is quite a lot of work, so we need more realistic shorter term aims for this project.

Goals For 2015

The goal for the end of year 2015 is to:

Build a new prototype programming environment based on newer, more modern ideas learned from experience and academia.

These include:

  • Familiar programming language - a programming language that will be familiar to most people; built on top of C++, but with ideas taken from Java, C#, Swift, Erlang, Scala, Lisp, Smalltalk, …
  • Static type system - the programmer must be clear and explicit about what the program is doing.
  • Use the static type system to solve more problems:
    • No null pointer exceptions - “The Billion Dollar Mistake” has plagued us for too long, and most modern languages are solving this quite neatly.
      • Some people don’t see this as a big deal. Sure:
        • If you check all your pointers in Java, C#, C++.
        • Or you use references (in C++) or stack values.
        • Or you remember to annotate them with @null and @notnull (or whatever).
      • Pointers, who needs ‘em.
      • Optionality is useful for saying “Zero or One of X”. Example: “ParseInt(String) -> Int?” either returns the integer or null (meaning you lose). Think of it as a soft-error return.
      • The static type system encodes the optionality in the type, and you can’t get round it without being explicit (i.e. I “assert” it cannot be null).
      • Programs can always crash :-(
      • See Error 1.
    • Non-exceptional run-time error handling - the programmer must explicitly and consciously handle the problems users might face as close to the source as possible.
      • Exceptions were initially thought of as a good solution, but have proven to be a “bad” idea, though many people haven’t realized or agreed on this yet.
  • Garbage collection - the programmer is absolved from worrying about memory management, computers are much better at doing this labor intensive task without error. This is contentious with some programmers; the answer to that is you shouldn’t do this for low level programming, but we’re not dealing in this area, so it’s good and right to do so.
  • Concurrency - modern computers have multiple processors; the programmer needs an easy way to use these CPUs effectively.
    • Use task/actor systems that share only immutable data. No locking and no mutation means the programmer can more easily understand how to do this well and without bugs.
  • Self awareness - The programming language provides a library to understand itself fully:
    • Convert text to an in-memory editable data structure.
    • Convert this in-memory structure back to text.
    • Analyze these data structures for full meaning, cross referencing, and compile time (programmer) errors.
    • Provide layers on top for editors, scripting, compilation, debugging and interactive features that are commonly needed.
  • Prototype - showing some of the basic required features listed in Background earlier, showing how easy such tools are to build given the foundation:
    • Find Definition
    • Find Reference
    • Auto Complete
    • Error Highlighting
  • Prototype stretch - show some cool, non-mandatory but important features to prove it really can work:
    • “Google Maps” Code View - this is an “exciter” in that I’ve not seen it done in many products, only hinted at in research.
    • Exporter
    • Compilation
    • Interactive Session
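
The “Zero or One of X” idea and the soft-error return from the list above can be sketched as follows. This is an illustration in Python rather than Harth: `parse_int`, `double_or_default` and `double_asserted` are hypothetical names, and Python only enforces the `Optional[int]` annotation when an external type checker is run, whereas the text above describes a language that enforces it at compile time.

```python
import re
from typing import Optional


def parse_int(text: str) -> Optional[int]:
    """Soft-error return: the integer, or None if text is not an integer.
    No exception escapes to the caller."""
    if re.fullmatch(r"\s*[+-]?[0-9]+\s*", text) is None:
        return None
    return int(text)


def double_or_default(text: str, default: int = 0) -> int:
    """Handle the 'nothing' case explicitly, close to the source."""
    value = parse_int(text)
    if value is None:
        return default
    return value * 2


def double_asserted(text: str) -> int:
    """The explicit escape hatch: the caller asserts the value cannot be
    missing. If that promise is wrong the program still crashes."""
    value = parse_int(text)
    assert value is not None, "caller promised a valid integer"
    return value * 2


print(parse_int("42"))
print(parse_int("forty-two"))
print(double_or_default("21"))
print(double_or_default("oops", default=-1))
```

The point of the design is that the caller cannot reach the integer without either checking for the missing case or writing the assertion, so the decision is always visible in the code.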

Status

On September 14, 2015

Rough completion for this year’s goal (not for a full final implementation) from the above list:

  • Familiar programming language - done
  • Static type system - done
  • No null pointer exceptions - done
  • Non-exceptional run-time error handling - done
  • Garbage collection - 80%
  • Concurrency - Deferred; this problem is complex to implement, not required to progress, and has already been proven to work very well on other projects.
  • Self-awareness - 50%
  • Prototype - 0%
  • Prototype stretch - 0%

Some stats follow. If you’re aware of typical programmer productivity figures, these are very high. However, I’ve been careful to work only about 8 hours a day, 6 days a week, to avoid burn out.

  • Lines of code written to date (from Jan 1st): 145,000
  • Average daily line count: 650
  • Number of code files: 25300
  • Total committed changes: 2670

On October 27, 2015

I’ve completed the Prototype goals already two months ahead of schedule! :-D

The stretch goals are not done, though; they were optional and will eventually get done when it makes sense.

The plan was to finish the prototype in November and December. Although it took nine months to program the “brains” of the project, once that was done it took only two weeks to implement the prototype features as extensions for an existing editor (Emacs).

On Feb 29, 2016

The Harth website goes live.

  • Initially just the home page.
  • Eventually will host Harth blog.
    • Organized by date.
  • Eventually will host Harth Technical Reference and API documentation.
    • Organized by language and version.
    • Automatically built from Harth code itself.

Financing and Business Plan

This project is NOT a business. Essentially I’m on sabbatical from work and doing research and development using some savings.

This project is entirely self funded. I’ll run out of money sometime in 2016 or 2017.

I have some ideas on a business plan; for now I only need money to live and eat.

The short term plan:

The long term plan (subject to changing my mind on a whim) is:

  • The product should hopefully be “almost” free, as in peanuts.
    • Probably a yearly subscription.
    • Think Atlassian Jira “free” model.
  • It will remain under my control (which means you can give me lots of cash, I just get to keep 51%).
  • I reserve the right to sell it to whoever I like for one trillion-billion whatevers. I’m not greedy however and thus:
    • Head towards inverse tithing (I keep 10%, give away 90% to worthy causes/endeavours).
    • I’d probably only buy a reasonable house and car.
    • Do lots of cool things for education, software and charity.
    • I may occasionally splash out on a nice holiday or a first class plane ticket.
  • I’m well aware that the likelihood of this leading to success is minimal.
    • Failing this, I’ll go get a normal job making software controlled widgets for space drones.

Source and Binary Products

Currently the prototype(s) are closed source; the code and built products are internal prototypes for research and development.

The projection for a first usable alpha is roughly somewhere within 2016 to 2017 (plus or minus as many years as required). Translation: I have no idea really, other than that it’s a huge undertaking.

Obviously if a few people help out (especially full time), this might be reduced very slightly.

What Does The Prototype (V0.2) Demonstrate?

This has essentially demonstrated one of the ideas; by providing the “brains” as components, programmers can write better tools.

I can demonstrate editing a very simple program with the following features:

  • Syntax error checking as you type;
    • Editor puts red squiggly lines on erroneous text, with explanations of errors on popups.
    • (For prototype, only updates on save).
  • Find definition;
    • editor jumps to definition of type.
    • (For prototype, works only on types).
  • Find references;
    • editor can cycle through all references to type.
    • (For prototype, works only on types).
  • Auto-complete names;
    • editor either completes unique name or gives pop-up list of possible completions to select.

Note that the features adapt to changes in the program as you edit it - the “brains” re-analyses the whole program as it’s being written.
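
To give a feel for what “find definition” and “find references” on types involve, here is a small sketch. It is not Harth code: it uses Python’s standard `ast` module as a stand-in for the “brains”, the sample program and the `analyse` function are invented for illustration, and it handles only class names (mirroring the prototype’s “types only” limit).

```python
import ast
from typing import Dict, List, Tuple

# A tiny sample program to analyse (invented for this example).
SOURCE = '''\
class Point:
    pass

def origin() -> Point:
    return Point()

def shift(p: Point) -> Point:
    return p
'''


def analyse(source: str) -> Tuple[Dict[str, int], Dict[str, List[int]]]:
    """Parse the whole program and return (definitions, references):
    definitions maps a type name to the line where it is defined,
    references maps a type name to the lines where it is used."""
    tree = ast.parse(source)
    definitions: Dict[str, int] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            definitions[node.name] = node.lineno
    used_on: Dict[str, set] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in definitions:
            used_on.setdefault(node.id, set()).add(node.lineno)
    references = {name: sorted(lines) for name, lines in used_on.items()}
    return definitions, references


definitions, references = analyse(SOURCE)
print(definitions)   # where each type is defined
print(references)    # where each type is used

# After an edit, simply re-analyse the whole program again.
edited = SOURCE.replace("Point", "Vec2")
print(analyse(edited)[0])
```

Re-running `analyse` on the edited text is the crude “reload everything” strategy; an incremental update would only re-examine what changed.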

Technical Details (V0.2)

  • Create a standalone “server” process which reads, parses and analyses a whole project from disk.
  • By connecting over TCP and sending commands, the “server” responds to requests for information about the project.
  • The “server” can update (reload at the moment, eventually incremental updates) the analysis of the whole project files.
  • Extended Emacs (a standard well known programmable editor) to talk to this server process.
  • Extended and adapted existing Emacs packages (flycheck, ggtags, company) which provide the front-end/UI/Emacs for virtually “free”. (No point reinventing the wheel here).

Not everyone uses Emacs as a program editor, but it would be fairly easy to write plugins for other editors; most of the work is done in the server process (for example: you can simply use telnet or netcat to talk to the server, it’s just a TCP text service like a web server). The “server” should really also provide information to format/colour/indent code - and/or possibly provide a standard editor component.
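
A line-based TCP text service of the kind described above might look roughly like this. It is a sketch only: the command names (`definition`, `references`), the reply format, the `.harth` file name and the hard-coded `INDEX` are all assumptions made for the example, not the real server’s protocol.

```python
import socket
import socketserver
import threading

# Stand-in for the analysed project (hypothetical data and file names).
INDEX = {
    "Point": {
        "definition": "geometry.harth:1",
        "references": ["geometry.harth:4", "geometry.harth:7"],
    }
}


class QueryHandler(socketserver.StreamRequestHandler):
    """Answers one request per line: '<command> <name>'."""

    def handle(self) -> None:
        for raw in self.rfile:
            parts = raw.decode("utf-8").split()
            valid = (
                len(parts) == 2
                and parts[0] in ("definition", "references")
                and parts[1] in INDEX
            )
            if not valid:
                reply = "error unknown request"
            elif parts[0] == "definition":
                reply = INDEX[parts[1]]["definition"]
            else:
                reply = " ".join(INDEX[parts[1]]["references"])
            self.wfile.write((reply + "\n").encode("utf-8"))


def start_server() -> socketserver.ThreadingTCPServer:
    """Start the service on a free local port in a background thread."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), QueryHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(port: int, command: str) -> str:
    """What an editor plugin would do: send one line, read one line."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall((command + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


if __name__ == "__main__":
    server = start_server()
    port = server.server_address[1]
    print(ask(port, "definition Point"))
    print(ask(port, "references Point"))
    server.shutdown()
    server.server_close()
```

Because requests and replies are plain text lines, the same service could be poked at by hand with telnet or netcat, and an editor plugin only needs the equivalent of `ask`.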