From: Sam Watkins <swatkins@fastmail.fm>
To: kirk@strauser.com, ron.l.johnson@cox.net, pecondon@mesanetworks.net
Cc: debian-user@lists.debian.org
Bcc: 
Subject: Re: OT: down with memory protection!
Reply-To: 
In-Reply-To: <200412021741.33834.kirk@strauser.com>

> No, they don't.  Memory protection is generally a free side effect of 
> virtual memory (not to be confused with "swap").  That is, virtual memory 
> is the goal, and memory protection is a good thing that comes along with 
> it.

I am arguing that if we only run machine code that is known to access memory
correctly, we will no longer need the virtual address space abstraction or
memory protection, and the system could be made considerably simpler and faster
as a result.

It is possible to have both paged virtual memory and one big shared address
space without inter-process memory protection.  Memory protection would only be
required to trigger the page swapper, and switching between processes would not
require remapping any memory.

> Congratulations - you've just invented shared memory (see shm*(2) for 
> details on the Linux syscalls that implement it).

> See "man 2 mmap" for details.

No need to be sarcastic.  I'm aware of shared memory and memory-mapped IO, for
example X11 can use shared memory between the client and server processes to
make communication of pixmaps, etc more efficient.  Most programs do not make
use of these extensions.  I am proposing to use these machanisms as the
"normal" way to do IPC and IO in an alternative (non-unix) operating system,
and to share buffers and objects that are smaller than a page.  Mapping files
would remain a page-resolution operation.

> > I think people don't normally use more than 4GB of VM on 32-bit
> > computers, at least they won't now that 64 bit CPUs are taking off,
> 
> Not even close to correct, actually.

Ok, I am simply saying that 4GB of VM is sufficient for a 32-bit computer,
there are a lot of things you can do with 4GB of VM.  If anyone wants to use
more than 4GB of VM with my hypothetical system they can get a 64-bit computer
as the system will not remap memory when switching between processes.
Switching between processes will be about as fast as a function call.

> > and the compiler and programmer could be required to prove that the code
> > would not loop forever without calling yield.
> 
> This is provably impossible.  Reference "the halting problem".

Nonsense.  Turing proved that it is possible to construct an algorithm which
cannot be proven either to halt or not to halt.  It is also possible to
construct sentences which cannot be said to be either true or false.
  1. most algorithms are not like this
  2. one can simply call "yield" inside a repetitive part of the algorithm
     which has not been proved to always terminate

We shouldn't abandon the idea of proving that our algorithms are correct simply
because some pathological algorithms cannot be proven correct or incorrect.

> Sorry, but your ideas have pretty much already been implemented (and in some 
> cases discarded).  :-/

I am not claiming these ideas are original.  I think synthesising them in this
way may be original, though (if it is doable).  Maybe not.  As for "discarded
ideas", plenty of discarded ideas turn out to have value after all in the end.

> Get a pre-OS X Macintosh and see how well you like it.  Seriously, it's 
> called cooperative multitasking, and it's generally not well-regarded.

I liked the Acorn, very much.  For the programmer, it is nice to know that your
process won't be randomly interrupted.

> Sam, to clarify: since the VM abstracts physical addresses away from 
> programs anyway, memory protection is largely a result of the fact that 
> programs simply cannot address memory outside of their abstraction.

I am aware of that.  This whole idea is about "context switching", and how
expensive (as in slow) it is, the main reason it is expensive is the need to
remapping memory in order to give each process its own address space.  The cost
of "context switching" has led to three different levels of multi-processing
being used in Unix software - processes, threads, and lightweight-threads or
co-routines (MS calls them fibers).  This complexity would be unnecessary if
inter-process context switching were cheap (as in fast), and it would be cheap
and safe if programs were proved or otherwise constrained to access memory
correctly.

> You can't write to another program's memory if you think that the whole
> address range belongs to you in the first place.  In effect, each program
> thinks that it owns the entire machine.

I know.  It is a lie, and an expensive one to maintain. 

> You think this guy is a CompSci student with just enough knowledge
> to be dangerous?

heh, thanks, I like to be dangerous.
Actually I flatter myself that I'm quite knowledgable.
Doesn't mean I don't have crazy half-baked ideas from time to time though.

> The several coroutines in your project can be implemented as very
> "light weight" processes. The context switch could be simulated (in
> C++, for example) by having each process be an instance of a process
> class. These could be arranged to each maintain its own internal
> state, and each give up control of the CPU on a regular basis.

Yes, this is the way - I am talking about making an system and language which
uses this mechanism as its one and only process abstraction, yet it can still
be multi-user and run source / bytecode from the net because the code declares
how it accesses memory and this is checked by the compiler before creating an
executable, and the loader ensures that the process is only accessing what it
has permission to access.

> Buggy code is the real world. Hardware that sets limits on what software is
> allowed to do is essential. Mostly memory protection hardware allows one to
> get a memory dump while there is still enough evidence in the dump to allow
> one to figure out what went wrong.

Code that accesses memory incorrectly, and needs hardware to limit what is can
do, is the real world of contemporary C and assembly language programming.  The
JavaVM in a browser, however, runs lots of class files from different sources
with different security restrictions without using memory protection, because
the language design guarantees that programs are not able to access memory that
does not belong to them.  Java is slow and a memory hog, but this is not a
necessary consequence of software memory protection.

> Also, hardware _does_ malfunction. Hardware checks and redundancy
> helps catch hardware malfunction.

That is a good point.  Memory protection might limit the destruction caused by
faulty hardware.  I don't think this on its own is sufficient reason to abandon
this idea, however.  If your ram is hosed you'll likely want to restore from a
backup anyhow.

> > In contrast, programs written in python, Perl or Java cannot peek or poke
> > at other processes (unless there is something wrong with the language
> > implementation). 
> 
> So, memory protection is needed on the computers on which new versions of
> these languages are tested. But, in the open source community that is all
> user computers.

Supposing we had a decent reliable base language without no creeping featurism,
and the python, Perl and Java engines were implemented in that base language,
this would not be a problem.

> Most software, the vast preponderance of software, has no proof as to
> its correctness.

I am only talking about very weak correctness: correctly bounded memory access,
and guaranteed termination of some small routines.  This isn't like business
logic level correctness.

> > The need for preemptive multiprocessing could be reduced or eliminated in a
> > similar way.  Programs could call "yield" every now and again, and the compiler
> 
> Multiprocessing is not necessary for successful computing, but it is
> very nice to have it as the complexity of the problems that you are
> trying to solve grow.

I am talking about using non-preemptive multiprocessing, not abandoning
multiprocessing altogether!  It would still be possible to interrupt a "hung"
process, but this would not be the normal scheduling mechanism.

> > Without the kernel/userland distinction, yield and other "syscalls" could
> > be implemented cheaply.
> 
> In the real world, multiprocessing and memory protection are implemented
> cheaply. (By cheaply, I mean at low cost.)

by cheaply, I mean fast.  "syscalls", specifically yield, would execute as fast
as a function call.  It should be possible to switch to another process at
around the cost of a virtual function call.

> > Regarding "the stack", I would like to see the demise of the stack
> > altogether!
> 
> The stack is a really important innovation in organizing algorithms and
> thinking about algorithms. Without it, one cannot think about recursive
> descent. If you think it would be nice to see its demise, you have probably
> never confronted a real algorithm. 

I meant I would like to see the demise of the monolithic unix/C style process
stack, not stacks in general.

Creating a new coroutine is expensive because each coroutine requires its own
stack, and stacks are big.  You can't have a stack smaller than a page, they
are usually created much bigger, and the stack grows by page-faulting.

I believe there are better ways to implement nested function calls and
recursion than to use one big stack per process.

"stackless python" has succeeded in abandoning the one-big-stack-per-process
model.

I think recursion is overrated, and that it is better to use non-recursive
routines that make use of one or more explicit stacks or queues.  This is more
generic than recusion, it is easy for example to iterate over multiple trees
independently in a natural manner, or to change from depth-first to
breadth-first order simply by substituting a queue for a stack.  It is also
potentially more efficient than recursion, as it is not necessary to store
cruft such as lots of frame pointers and return addresses.

The stack concept, although important, is hardly an innovation.  It is also
vastly overused and overrated because C makes it compusory.

> I think Acorn has a stack, and its op-code set would be a shambles without
> it.

yes it does, but the ARM CPU does not have a concept of a system stack.
There is a convention that R13 is used as the stack pointer.

> Non-preemptive multitasking is OK on a computer that can boot up in a
> flash after a crash, except that after a quick reboot, you have no idea why
> the crash happened or what to do to fix a problem.

In later versions of non-preemptive RISC OS there was a way to generate an
interrupt that would stop the current process and let you kill it.  Also, an
alarm could be set by the scheduler to preempt and stop malfunctioning
processes.

I am not advocating the abolition of interrupts!

> The cost of compute hardware is in a long term downward trend. The cost of
> computer software is driven by the time it takes to write it. (and assuming
> one pays programmers a living wage.) A language to help write software needs
> to address the root causes of software cost.

I think the main reason it's hard to write software is that we don't program
using small-scale processes (objects with a thread of their own).  Simple,
active modules are highly reusable, as the unix shell has proved.  Normal,
inactive objects such as one finds in "object oriented programming" are not
enough.

> One can get buggy software to run on a computer with memory protection faster
> than getting it to run on a computer without memory protection.

One could compile half-baked software with runtime bounds checking.

Most high-level programs can be written using existing modules (such as the C++
STL) which implement safe containers.  Iterating over a container can be done
safely with a generic "for each" loop (but a proper "for each" structure can't
be implemented in standard C++ except as a macro that uses a GNU extension!)

> I think the combo of memory protection and buggy software will win the
> competition against other approaches.

A language that is as fast as C but can be proved to use memory correctly would
be popular with people who care about security.


Sam