Jim Miller - CLR Architect, Microsoft CLR Team

Jim Miller holds a PhD in Computer Science from MIT (Parallel Processing under Bert Halstead), and served on the faculty at Brandeis University as well as on the research staff at MIT (both the AI Lab and the Lab for Computer Science). He has been on the research staff at Digital Equipment Corporation and the Open Software Foundation. Before joining Microsoft, he was on the senior management team of the World Wide Web Consortium, reporting to Tim Berners-Lee and in charge of work on security, electronic commerce, child protection, privacy protection, accessibility, and intellectual property protection. He joined Microsoft in 1998, leading the program management team for the kernel of the .NET Common Language Runtime (CLR), where his responsibilities included garbage collection, metadata definition and file formats, intermediate language (IL) definition, IL-to-native code compilation, and remote objects. He also served as editor for ECMA TC39/TG3 and ISO/IEC 23271:2003, which are the international standards for a Common Language Infrastructure. His current work, as software architect for the CLR, involves designing an architecture to allow innovation in the core of the CLR and the managed Frameworks while preserving backward compatibility.

So, Jim, tell us a little about who you are, what you do, your responsibilities, and what your role at Microsoft is.

Well, my name is Jim Miller, I'm a Software Architect in the common language runtime. I came from the World Wide Web Consortium 6 years ago, where I was in charge of technology and society. So, I used to do stuff with child protection, and security, and commerce, and whatnot. I came here to get the managed code movement started at Microsoft. I've been on that team for six years. I started as a program manager in charge of all the low level gunk in the system and then moved up to being an Architect, mostly focused on future directions, where we're going, and Longhorn, and beyond. That's a little bit of my background there.

So, on a daily basis? What does a CLR architect do? What are you doing, day in, day out, morning?

The joke is: I think great thoughts. The Architect job is a loosely defined role at Microsoft. What we're concentrating on right now is what we call the Versioning Story: how we're going to make it possible for applications written today, or for that matter written with the past two versions of our framework that have already shipped, to continue to work in the future for the next, we're hoping, 15 to 25 years, without changes to the code. As we gradually move the framework forward, because we want to innovate and move the whole managed code world forward on all of those frameworks, we don't want to lose the apps. It's something that Microsoft, like other companies, has done for many years in an unmanaged world, in a flat API world, but doing it in a world with a heavy object-oriented infrastructure, with a lot of subclasses and virtual methods, has really never been done before.

Generally, you get the first version out, if you're lucky you get a second version out. Then innovation stops in what you ship and eventually innovation only happens in newly added things. We want to be able to change that paradigm. We want to be able to innovate in the middle there without disturbing the app.

So basically you're saying you're responsible for the global assembly cache and stuff like that?

"Responsible" of course is not a word for an Architect, we are never responsible for anything, thank you very much. But, indeed, my job is to figure out the future of the GAC. It will change form rather dramatically: its function's going to change, how it works will change, its whole invocation will change. The entire way we are dealing with configuring applications for versioning purposes; all that will change as we move to Longhorn and beyond.

The profound effect is that as we move into the operating system, as the operating system is written in managed code, it forces us to think differently about how applications in the operating system interact. So, we have to address all of that; make a much, much simpler story, we think, than the current one.

You mentioned, when you talked about what you're working on as an Architect, you mentioned configuration. Tease us a little bit. Tell us what you're working on in terms of configuration. Are .config files going to go away? What's cooking back there?

Yeah, so let me be very careful. Configuration covers a very, very wide breadth of things. It covers any kind of flag you would set to control the behavior of some underlying piece of the system. I'm not really worried about that. There are other people who worry about it; how general configuration will work.

I'm worried about the part of the configuration that says, this app was written for version 1.0, so run it as though it were 1.0, versus, this is written as 1.1, lock it down, that sort of thing. So that part of the configuration, the part that right now is supported runtime, required runtime. Those pieces; that's really where my work is.

So there's more beyond publisher policy? There's more of that kind of stuff?

In fact it's, hopefully it's less beyond publisher policy. We want to get rid of publisher policy and the future goal--if I had to put it in a pithy little sentence--is that you should never have to configure your apps. That should just happen automatically, for this purpose.

Obviously, you have to configure things like where the server is; those kinds of things are not going to go away. But from the point of view of which version of the runtime I was running on, in almost all cases we expect that that will just automatically be taken care of. And then in the exceptional conditions, we want a very simple system for you to say, Okay, it didn't work with the normal things, here's what you have to do.

So we're hoping, in point of fact, that developers almost never have to worry about this. To a certain extent, they will have to change their thinking just a tiny bit about how they write their code, but they don't have to worry about the actual details of configuration. And administrators almost never have to deal with it either, but when they do, it's a very simple story, easy to manage.

So how far out are you, ahead of the rest of us? Not necessarily in terms of days or months. But how far ahead of us, in the sense of generations? When will we see what Jim Miller is working on today?

Okay, so in the typical Microsoft fashion, I won't say how far I am in days, or months, or years. The things that you'll see will have a major impact on the Longhorn operating system release, and the CLR that goes with that, which is codenamed Orcas. The next version of the runtime, codenamed Whidbey, really has almost nothing to do with this. It's the one after that that has most of this work, and the work going on beyond there. So, I'm focused on Longhorn and beyond.

It's obvious that Microsoft has put a significant amount of effort into the standardization committees. I know that that was a significant part of your role as an architect on the CLR. Why?

Very good question. Why do standards at all? Actually, it's a very complicated question and the reasons change with time. The original reason was really twofold. One was to make a very strong statement that this is a platform that's open for everybody. That it's not going to be changed by Microsoft at its whim. That it's been subject to scrutiny by anybody who chooses to participate in the process to make sure that as we did it--it was very interesting.

We did something very unusual for Microsoft. We did the standardization while we were doing the development. We're also doing that for Indigo, but it was a brand new thing at the time, when essentially, we were one of the pioneers for that.

We wanted feedback into the product process itself. So going to these meetings, talking to people who were not from Microsoft, who were not part of the internal politics, but were looking at the larger world. People who came from the Java world and wanted to make sure that they could understand the compatibility basis. It really was to make sure we had as broad an input as we could, and to make sure that it was really--we could understand the definition of the lowest of the lows of .NET, independently of the operating system. Independent of the processor, so it really is a level playing field for everybody. So that really was the single biggest factor.

And of course, it's always a delightful thing to say we have an international standard. It is then the base for other standards. We're just beginning to see that pay off actually. We find the digital video broadcasting standards, and the cell phone standards are beginning to say, Oh, there's an international standard for how to write an intermediate language that's language independent, maybe we should roll on to that, and so we're just beginning to see that standardization effort take off. So, it was the base of a lot of things that are coming now.

So, we have the second generation of the product coming, the .NET 2.0 framework. Will we see .NET 2.0 standards? And do you know a time schedule for when?

Tying a standard to the product is a very, very tricky thing. So we have been as careful as we can to say that those standards are independent. We try to have the standard come out around the product release, because that's the most sensible thing. It's hard to do the standard way prior to the product release, because the product is going to change and then you have a standard and a product that don't match. They loosely match, they don't precisely match.

There will be a version of the standard coming out. The current goal is that the committee finishes its work in, I believe it's September of this year. The process is kind of interesting. You go: in September the committee that's doing the work, the technical group, closes work on it and hands it off to the next level in the system. They meet in December; have a final ratification in December, which would make an official standard available at the end of this year. Then they automatically forward it on to ISO, for an International Standard. And then there's a fast track process at ISO that can run anywhere from six months to a year. So about a year later we expect to have an ISO standard out of it as well. So that's the rough time line.

And what it will include--the largest single thing it will include will be the specification of generics. I should actually be clear about this. There are three related standards. Actually, now it's up to five related standards. The one that I work on is the Common Language Infrastructure, which is the lowest level one. And what drives, what causes us to decide to put things in there, is basically: is it needed for these standardized programming languages that want to work on top of us?

So we added generics because C#, which is a different standard, also from ECMA, also running on the same time schedule, needs to have them for their language definition. The language Eiffel is also being standardized by that group. JScript is being standardized by that group. And most recently managed C++ bindings to the Common Language Infrastructure are being standardized.

So we have a family of five standards, but not all on the same schedules. C# and CLI are on one schedule, the other three are on independent schedules. But what those four need drives what we put into the CLI. Plus, what we see as really common usage patterns that are going to be affecting everybody who wants to use managed code.

I know you were involved, to some degree, with the release of Rotor, the shared source implementation of the CLI. Obviously with the new release coming forward, are we going to see a Rotor-Whidbey, Rotor 2.0, whatever you're going to call it?

Yeah, but we don't know what we are going to call it yet. Internally, we call it Rotor-Whidbey. We announced this with the original one, and it remains true, that the way to think of Rotor is it's just another build of the runtime. Just like we have builds for the server, builds for the client, we have a build for Rotor. So it actually is the same source code; obviously it doesn't include all of the features.

So, what we're doing is we have a small team that's tracking the product team. So, as we build Whidbey, we're constantly building Rotor as well. We're keeping that up and running. We're making sure it runs on all the platforms. We're making the decisions of what features go in and what features do not, and keeping it alive and building.

Then the goal is, we hope, within six months of the product release of Whidbey, there will be a Rotor release as well. That's not committed--we are aiming to have it co-released--as a practical matter, it will be three to six months after, but the goal is to co-release.

Obviously, as the architect, as any architect, I'm sure, will agree, there's a tremendous amount of heart and soul that you've put into the CLR. And as any proud parent, I'm sure you're happy, and justifiably so, with what has come out. But what aspects of the CLR, in retrospect, do you wish maybe you hadn't shipped? And I guess, more importantly, what would you like to see done, going forward? What new things would you like to see the CLR embrace?

Well, regrets? Did we put anything in that we shouldn't, at the level that I work on? So, below the class libraries. There are a couple little wrinkles in there that are probably mistakes. Fortunately, nobody built significantly on them, so they tend to quietly lie in wait and then bag us later.

There are a couple there that I would dearly like to see us actually go in and complete. The most interesting things to me are ones that have to do with, obviously, my current focus on versioning, but there are a few features that will make it easier to move the code forward over time. There are a few features that will enable other programming languages--that's actually the passion that drew me into the common language runtime team in the first place--a passion for programming languages.

I've done programming languages and development tools for about 20 years of my 25 year career and I really enjoy a wide spectrum there. The ones that I think we didn't do as well as we could have, but are actually quite important as well, are the scripting languages. We did some things to help them. Like JScript has a very efficient implementation on top of us, which is very nice, but it took a fair amount of work to get it to the level it was.

Most recently, what we were really gratified about was finding out that the Python community is beginning to take a look at the runtime. There's a really interesting article by Jim Hugunin, I think that his name is, that I absolutely adore because there's one paragraph in there that's just wonderful about how he started to write a paper about how hard it was to get a good performing Python on top of the runtime, but to his dismay, it actually performs very well and in many cases it outperforms even the C implementation. Nonetheless, we don't think it's doing a particularly good job for dynamic languages, so we want to put some stuff into it to make it even better for that purpose.

On versioning, I'll mention one of my pets; if your listeners have any comments on it, I'd love to hear them. One of the things that I see as a limitation right now is that interfaces, which I love and I think are a great way to go, don't allow you to put default implementations for your virtual methods. So you can define an interface and say that anybody who implements it must implement this virtual method, but you can't give a default implementation. And yet there's lots, and lots, and lots of things where the default implementation is trivial to write in terms of just a few things that you'd have to write yourself.

I think of complex numbers as one of my favorite things. So, I might require you to figure out some way to do the real number part and the complex number part, then I can automatically give you the code for doing the other ones; the radius and [theta] stuff. Anything like that, where there are just a few things you have to define, and then all the rest is done for you.

For versioning purposes, it's incredibly important, because one of the brittlenesses in the system today is that if I have defined a class, I can add a method to it, because I can provide an implementation, and your code is unaffected. You don't call it because it's a new method, that's okay [...] guys call, then that's all fine.

But I can't do that with an interface, because if I add a method to an interface, all the guys who implemented it have to implement it. Well, what if I want to add methods that are trivially implemented in terms of the other ones? I can't do it. It's actually a brittleness that we're finding in the system. So that happens to be a pet peeve of mine.
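Editor's sketch: the complex-number idea above can be illustrated in Python, where an abstract base class may carry default method bodies. This is only an illustration of the concept, not CLR or CLI code; the names `ComplexLike` and `Point`, and the choice of `real` and `imag` as the two required members, are assumptions made for the example.

```python
import math
from abc import ABC, abstractmethod


class ComplexLike(ABC):
    """Hypothetical interface: implementers must supply only real and imag."""

    @property
    @abstractmethod
    def real(self) -> float:
        ...

    @property
    @abstractmethod
    def imag(self) -> float:
        ...

    # Default implementations, written purely in terms of the two
    # required members. Adding another method like these later does
    # not break existing implementers, which is the versioning point.
    def radius(self) -> float:
        return math.hypot(self.real, self.imag)

    def theta(self) -> float:
        return math.atan2(self.imag, self.real)


class Point(ComplexLike):
    """Hypothetical implementer: defines only the required members."""

    def __init__(self, re: float, im: float) -> None:
        self._re = re
        self._im = im

    @property
    def real(self) -> float:
        return self._re

    @property
    def imag(self) -> float:
        return self._im
```

Here `Point(3.0, 4.0).radius()` works even though `Point` never defines `radius`; the interface supplied it.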

So those are the kinds of things that I want to drive from that point of view. From the point of view of dynamic languages; one of the things that's hard to do from a dynamic language point of view--by dynamic language, I mean one where the programmer doesn't write it in terms of types. You don't define types, you don't think about it that way at all, you just have variables and they have values.

Sort of like Ruby and Python?

Ruby, and Python, and Perl, and there's a number of those. We have one that we've talked about a little bit, called Monad, that we're doing for scripting at Microsoft. They all have that same property, which is, you just define the variable, you give it values, and then all of your operations deal with all of the major data types.

So you have an add operation, it knows what to do with the integers, that's obvious. It knows what to do with strings, which is, they look like numbers, it prints them in numbers, adds them because it's string [mag] and that kind of thing. So, you write your language that way.

My own personal favorite, because it's my background, is Lisp and Scheme; I've been doing that for years. Inside they're data typed. It's the values that have the types, not the variables; it's a subtle distinction, but it's important.

So, for those languages, one of the things that is very, very hard is when you call the add method, you don't know what data types it's getting, the compiler doesn't know, even the runtime JIT compiler doesn't know. You don't know until runtime, when you look at the types of the two arguments and say, okay, the first one is an integer, the second one is a string, what do I do to add them?

It's called method dispatch. Writing an effective method, a very, very efficient method dispatch where you don't know the types in advance is difficult in the current system. So we have a proposal for some stuff we can do to make that much more efficient and we'd like to see that happening in the future too.
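Editor's sketch: a minimal Python illustration of dispatching an add operation on the runtime types of both arguments. It shows the problem being described, not the CLR team's proposal, which the interview does not detail. The table `_ADD_TABLE`, the function `dynamic_add`, and the coercion rules for mixed integer and string arguments are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, Tuple

# Dispatch table keyed on the runtime types of (left, right).
# The mixed-type rules are assumed for illustration only.
_ADD_TABLE: Dict[Tuple[type, type], Callable[[Any, Any], Any]] = {
    (int, int): lambda a, b: a + b,
    (str, str): lambda a, b: a + b,          # plain concatenation
    (int, str): lambda a, b: a + int(b),     # numeric-looking string
    (str, int): lambda a, b: int(a) + b,
}


def dynamic_add(a: Any, b: Any) -> Any:
    """Pick an implementation only after inspecting both argument types."""
    handler = _ADD_TABLE.get((type(a), type(b)))
    if handler is None:
        raise TypeError(
            f"no add rule for {type(a).__name__} and {type(b).__name__}"
        )
    return handler(a, b)
```

Every call pays for a type inspection and a table lookup before any arithmetic happens, which is the cost an efficient dispatch mechanism tries to reduce.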

In your answer you said, "proposal". You were making a proposal. To whom? To whom are you making these proposals? Is this to the committees? Is this to Microsoft? Is this to Bill?

Internally to the CLR team. That's an interesting question. How do these things relate? We, Microsoft, tend to propose things internally, check them with the programming language teams and a few external programming languages that we work with closely to make sure that the features make sense. Make sure that we know how to implement it. Make sure that we think it will actually improve performance. We'll do a prototype of it in-house.

Once we're sure that it makes sense and really would improve performance, and are fairly certain what the specs ought to be, then we submit it to ECMA to be standardized.

Microsoft does not submit something to ECMA standardization until we're fairly sure that we're going to put it in the product. So, it would be embarrassing if the standard standardized something and then we decided, for good and legitimate reasons on the product side, it won't work. There's no gain there.

So, obviously, one of the key elements of the CLR is security. And as we've seen with Bill Gates' security push, and so forth, is the CLR's code access security the answer to Microsoft's security problems? Microsoft's and others' security problems, let me be fair.

Let me correct a couple of things. Actually, one of the areas I am not responsible for in the runtime is security. It happens to be, however, a deep passion of mine, back from the 1980s when I was involved in designing and helping build one of the very first secure operating systems. So, I've been involved with it there. I was involved with it at the World Wide Web Consortium, and I was involved with helping the design happen, but it's never been my actual responsibility. In fact, they have a very good security architect they recently hired who actually owns the problem.

CLR security isn't going to solve all the problems, but it solves two that I find extremely important. One is the problem of what do you do about something that we call semi-trusted? Something that you've just downloaded off of the internet, you don't know whether it's safe or not, you want to run it in a sandbox.

We didn't want to go the route that, let's say, the Java community went, where the sandbox is pre-built and you're stuck with what's in it. We wanted to go with one where it's capable of being decided by the administrator. You have individual permissions that you can grant and it's a simple flexible system and it allows you to run a range of apps, given different permissions.

And that's the direction Java's at right now?

That's where Java's gone, that's where we've gone. Actually, this idea is not particularly new. When I was at the World Wide Web Consortium we had a major project on exactly this, on which we had both Sun and Microsoft working with us. It's a great idea.

And a trust management system in the whole business. So, that's one thing that you get out of the CLR. Because it's managed code, and because we have a JIT compiler, it doesn't matter if it's an interpreter or a JIT compiler, whatever, but we have the technology that goes between the intermediate language and the native code, we can insert security checks in that process. And we can change over time what they are and how they work. So there's a good deal of control that is there and we're really exercising that to a good degree. So that's one of the two things.

The other is we can get to a point where--ever since 1972, when I wrote my first multi-user database system, we didn't use an existing database frame, we just wrote it from scratch. And what we discovered was that we wanted it to be the case that the database program we'd written could get at its data files. But we didn't want the users to get at them. So we didn't want user-based security because, me as a user, I never want to go in and edit the file. I'd just corrupt the database, that wouldn't make sense. I wanted to give the program the authority to go in and change the database.

And you can't do that in traditional operating systems. Even Multics, which tried to do this with a series of rings and stuff, couldn't do this because they gave it to the ring and not to the program. So you'd get your program into the ring and then anything in the ring could do it--it was a different thing.

But code access security gives you exactly that. You can say, this piece of code has the right to access those files. Nothing else on the system does. So, to me, our security system itself solves two problems.
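Editor's sketch: a toy Python model of granting a permission to a piece of code rather than to a user, in the spirit of the database example. It does not model how the CLR actually implements code access security; the names `POLICY`, `SecurityError`, `open_for_write`, the code identities, and the file path are all hypothetical.

```python
class SecurityError(Exception):
    """Raised when a piece of code lacks the permission it needs."""


# Hypothetical policy: file permissions keyed by code identity,
# not by the user who happens to be running the code.
POLICY = {
    "database_engine": {"/data/db.dat"},
}


def open_for_write(code_identity: str, path: str) -> str:
    """Allow the write only if this code identity was granted the path."""
    allowed = POLICY.get(code_identity, set())
    if path not in allowed:
        raise SecurityError(f"{code_identity} may not write {path}")
    return f"{code_identity} writes {path}"
```

In this model the same user can run both the database engine and a text editor, but only the engine's code is permitted to touch the data file.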

Now, it doesn't solve all problems, there's lots of other problems too. So, the work that is going on, really, in the security system is to merge several different security systems that are there for very good, different reasons, into one single thing that's administrable and makes sense, rather than having just a myriad of them all over the place.

And a lot of that is happening in other areas as well. The versioning stuff as well, it's the same thing. ASP.NET had its way of doing versioning, SQL Server has its way of doing versioning, the runtime had its way of doing versioning, the COM system had its way of doing versioning, and part of that is also trying to coalesce those into one model. There's a lot of that going on, in several areas.

So, thank you Jim, for your time. And just as a parting shot, I just wanted to say, I'm thoroughly enjoying your book, "The Annotated CLI...", help me with the title again.

Jim: "The Common Language Infrastructure Annotated Standard."

Ted: Thank you for your time.

Jim: "Okay."


All Content Copyright ©2007 TheServerSide