SafeInt at Boost?

Developer
Jun 21, 2009 at 1:51 PM

Hi David,

There's another discussion on SafeInt at the Boost developer's mailing list, started earlier today: [boost] SafeInt code proposal. A guy named Omer Katz proposed to add a utility to the Boost libraries that looks very much like your SafeInt! (But yours is superior, definitely.) Actually I'm surprised he doesn't give you some credits. (But maybe he’s just reinventing the wheel.) Anyway, I do think that it would be useful to have your SafeInt as a Boost library. It would very much help to increase its number of users. Boost has a very dedicated group of volunteering developers, which could review the class and possibly suggest further improvements. Furthermore, as a Boost library it would be tested automatically for many platforms: www.boost.org/development/tests/trunk/developer/

Of course, it would need some “restyling” in order to make it have a Boost look-and-feel. For instance, I guess it would need to be renamed, e.g., boost::safe_int. But it would still be your library, you’d still have the copyrights.

What do you think? I’m certainly willing to help you, to see if we can get your library into Boost.

Note that Boost users have been asking for a Boost version of SafeInt before:  [Boost-users] Portable SafeInt?

Jun 21, 2009 at 7:11 PM

Hi,

As I wrote on the Boost mailing list earlier, I had no idea such library was previously created by David and distributed freely.

We (my partner and I) first heard of David's library yesterday when it was mentioned in the replies to our proposal.

Indeed, his library is currently more advanced than ours. If David chooses to port his code into Boost, we will be more than happy to help.

Developer
Jun 21, 2009 at 10:43 PM

Hi Omer,

Thanks for your reply. I'm sorry I hadn't read the mail you posted last month, [boost] SafeInt library for Boost, otherwise I would have told you about David's library before. It might have saved you a lot of time! BTW, I do also reinvent the wheel from time to time, so that's no problem.  :-)

Kind regards, Niels

Coordinator
Jun 23, 2009 at 7:00 PM

I'd like to have some more information before commenting extensively. First, why should we do the work? I am not sure why someone couldn't just use the library as is - what benefit is there to the users by having it in Boost? I am not making the argument there is none - I am just asking because I don't know.

Next, what is involved in a port? I'd like to understand what is needed. I'm also wanting to try to keep the number of supported versions to 2 - this version, and the Visual Studio version. If I were to port it to Boost, I'd want to make that the only non-MS version.

Finally, the biggest issue for me is that I'd have to get the Microsoft legal department to look into any licensing issues. The license I have on it now is what they like to use, and as a lay person, it seems to allow what I want, which is anyone can use it, derive from it, and they aren't encumbered by any sort of onerous requirements. I do not know how boost does licensing - perhaps this is OK, perhaps they can handle the license that is on it now, but if Boost cannot deal with the license that is on it now, then I'd have to get legal review on any new license. This is all possible - I just have to ask for some meetings, start some mail threads, etc. It will take at least a few hours of work, and the simplest thing for me is to keep the existing license. It could possibly turn into a large hassle. If there is some conflict between the Boost license and the MS license, I'd be interested to know what that is.

If you'd like to take this to off-line e-mail, that's fine -

Jan 21, 2010 at 5:42 PM
Edited Jan 21, 2010 at 6:14 PM

Hi David,

I personally would love to see (any) SafeInt library included in boost, if this SafeInt library included in boost was yours I imagine it would save a great many people quite a bit of time (in terms of rewriting what your SafeInt implementation already does). I believe the issue is one of trust and perspective. I'm sure you're aware of how a great many open source advocates (and zealots among them) feel about Microsoft. While I can appreciate that you believe the MS-PL allows exactly what you want (and I can't argue, I can't speak legalese either), a lot of folks simply trust licenses that are provided by people or groups that have no monetary interest in their code a lot more. Was it you who, without oversight, chose to open-source this code or did Microsoft allow your request to open-source it? I might also point out that the mere fact that you would need to ask Microsoft's legal department about licensing issues with your own now-open-sourced code is a rather large red flag. Reading and rereading the first sentence of your third paragraph only made me question who owned the code. At least in theory (as I currently understand it from what I've read) that's you but the copyright notice on line 45 of version 3.0.12p says otherwise.

I do hope you'll consider proposing your SafeInt implementation to boost (and using boosts' license).

Geoff

P.S. Imagine the good publicity for yourself and Microsoft if this happened, not to mention the increase in eyeballs!

Coordinator
Jan 21, 2010 at 11:51 PM

I'd like to get it included in Boost myself. I've been really, really busy the last year - we're getting close to shipping Office 2010, so I've been working a lot. I think you can certainly use this with Boost code right now.

As to the license, let me clear up a few things - first of all, I Am Not A Lawyer. If you are using ANYTHING from the Internet, especially in a commercial product, you ought to get legal advice from a lawyer about what it means. Some licenses have some interesting restrictions, others have very mild restrictions. With that caveat, let me give you my interpretation of the license:

2. Grant of Rights

(A) Copyright Grant- Subject to the terms of this license, including the license conditions and limitations in section 3, each contributor grants you a non-exclusive, worldwide, royalty-free copyright license to reproduce its contribution, prepare derivative works of its contribution, and distribute its contribution or any derivative works that you create.

This means that you can use SafeInt, you can modify SafeInt, and you can distribute the results, whether that's your SafeInt or product that uses SafeInt. That's a very broad grant of license. You don't need to worry about us changing it - when we posted it in public with that license, that's what we gave you. And we agree (royalty-free) not to charge you if you do.

(B) Patent Grant- Subject to the terms of this license, including the license conditions and limitations in section 3, each contributor grants you a non-exclusive, worldwide, royalty-free license under its licensed patents to make, have made, use, sell, offer for sale, import, and/or otherwise dispose of its contribution in the software or derivative works of the contribution in the software.

If we had a patent on this (and we do not), then you'd have a license to use this work. Since we don't have a patent on SafeInt, this doesn't apply.

Now for the conditions:

A) This doesn't mean you get to use our trademarks - should be self-explanatory, and not a factor for most people.

B) Only applies to something patented, so doesn't matter here.

C) If you choose to redistribute my code, you can't say you wrote it.

D) If you redistribute this code in source form, the license has to stay with it.

E) We're not claiming the code is perfect. Use it at your own risk.

Don't trust licenses based on who they're from. Trust licenses based on _your_ lawyer's opinion of them. I've spent a lot of time learning about intellectual property issues, and am good at it for a non-lawyer, but if I have questions, I get a lawyer. I haven't looked at the Boost license in depth lately, but IIRC, the MS-PL license is _less_ restrictive, though the Boost license is one of the better ones. BTW, the various GPL versions can vary significantly in what they mean, so be careful with those.

We have no direct monetary interest in this code - we're not charging you to download it. We put it out here for free because we want the Internet to be a more secure place, and that's not just my position, but it is the Microsoft corporate position as well.

In the sense that I wrote it, this is my code. In the sense that Microsoft was paying me to write the code, it is really the property of Microsoft. I'm just the author. In order to post this code, which is very, very close to the exact same code as we compile into Office, Windows, and a bunch of other projects, I had to get VP level approval. If I want to post code that we don't ship, then that's much less restricted - see the offcrypto project for an example, the test harness and gcc header for this one would be other examples. Only makes sense - I wanted to make something public that is Microsoft's property, and we invested a _lot_ of time in getting it to production quality, testing, etc. If we're going to put that out there for free, I need to make sure it's OK with someone who has authority to allow it. If I'd posted production code to the Internet without being allowed to, that's not only ethically wrong, but it would likely get me fired. Because we want all of our customers, and even people who might not be customers, to be more secure, we distributed this in open source form.

I think you're misunderstanding my need to discuss this with our legal department. The problem is that we own the code, we have our MS-PL license on it, and if it goes out in a Boost library, it would then have a Boost license on it. These licenses aren't exactly the same. Maybe they can co-exist and both can be on it - not sure. Since I don't personally _own_ the code, I can't just swap one for the other. Now we have a legal decision to make, and the right way to solve that is to go talk to a lawyer about it. We have plenty of lawyers who will help me with this, I just have to find time. Doesn't mean we can't, but I do have to go sort out what it means. While I think the difference between the Boost license and MS-PL are minimal, something to look out for is that some licenses are contradictory - if you include the wrong combination of OSS software, you can get into a state where you can't meet the requirements of one without invalidating one of the other requirements.

Good publicity is always nice, but if you search on 'David LeBlanc security', you get over 500,000 hits, so I think I've had my 15 minutes of fame and then some. I'm not motivated by publicity for myself. I am motivated to help make people's apps be more secure - that's what matters, not my ego. In terms of positive Microsoft PR, yes, that's also nice, but not the driving factor. The important thing is to make people secure, and getting this into a more well-known library and possibly even into the standard library would help do that.

BTW, a Microsoft-specific version of SafeInt is shipping in Visual Studio 10. If someone reading this is programming only for the Windows platform, you should switch to that one when you upgrade libraries and the compiler. If your code is cross-platform, or you're on an earlier version of the compiler, stick with this one - I will keep maintaining it.

 

 

 

Mar 29, 2011 at 10:27 PM

Hi

I just would like to say that having SafeInt in boost would be very much appreciated. The C++ community would be grateful.

Many a companies have defacto  rules like "no 3rd party libraries allowed except boost". People trust the boost libraries just because the name is so well known and respected, even if they are a bit buggy sometimes.

Also it would be good PR for MS. 

Just my 2 cents.

Coordinator
Mar 29, 2011 at 11:31 PM

Thanks for reminding me - I think you could probably get an exception for this one, as you're almost certainly allowed to use libraries from Visual Studio, and this version is very, very close to the same thing as in Visual Studio. I do need to go have the discussion with legal people to see how this could work.

My focus just this moment is to get 3.0.16 up - I'm trying to get the warnings down, and get it to work right with the older compiler Apple likes to use.

Aug 8, 2011 at 9:41 AM

Boost has really smart people in its community, and sets ultra-high standards.  Libraries are peer-reviewed too, so you'd get a lot of good feedback on your code.  For example, I grabbed the latest SafeInt3.hpp, opened it up, and immediately saw this:

#if !defined NULL
#define NULL ((void*)0)
#endif
// These two may not be defined, either
#if !defined uintptr_t
#define uintptr_t size_t
#endif

#if !defined intptr_t
#define intptr_t ptrdiff_t
#endif
// Might need this for gcc and older Microsoft compilers
#if !defined nullptr
#define nullptr NULL
#endif

Code like that is evil.  The folks at Boost would surely let you know about it, and you'll learn modern C++ programming paradigms, improving the quality of your code.

I hope you can get the legal stuff out of the way.  Good luck!

Coordinator
Aug 8, 2011 at 6:22 PM

Could you perhaps explain how these are evil?

If this is really wrong, then there's a number of standard libraries that have bugs. There's 10 different headers in the Microsoft headers that have NULL defined, which is needed because it isn't defined as part of the language. A compiler and platform-independent library such as this one has to define anything that isn't part of the language standard, or it has to include standard headers that will.

Then uintptr_t and intptr_t are part of the language, but some versions of gcc that are supported by this class don't seem to know that these are keywords, and want them defined. If you have them defined, then these conditional definitions don't matter. Finally, nullptr is part of the newest edition of the standard, and this class is still supported on a number of compilers that are several years old, and you need that either defined as a keyword, or it needs a #define.

Thanks for your interest, and I'm happy to know that Boost could teach me more about modern C++ programming.

 
Aug 9, 2011 at 1:26 AM

Thanks for listening!  Of course, I'll explain.

The standard library is a special kind of library.  It's allowed to declare identifiers that are reserved.  This includes:

  • Names in the global scope that begin with an underscore
  • Names anywhere that contain two consecutive underscores
  • Names anywhere that begin with an underscore and a capital letter

That's why you see a lot of underscore usage in standard library headers.  This makes sure that the library's internal identifiers don't clash with everyone else's.  For this to work, though, everyone else must also not declare reserved identifiers.  So the standard libraries get to use names like __foo, _Foo, and foo__, while everything else gets to use names like foo, Foo, and foo_.

Where does that place non-standard libraries?  The Boost convention is to use namespaces.  Just as the standard library puts everything into namespace std, Boost puts everything under namespace boost.  Macros ignore namespaces, so Boost capitalizes all macros and prefixes them with the name of the library (e.g. BOOST_UNREACHABLE_RETURN).  The rationale for that is the same as for namespaces, which I'll get to in a minute.

In the first piece of code, you define NULL as ((void*)0).  There are two problems with this.

  • The user now cannot write char *p = malloc(n); because void * is not implicitly convertible to char *.
  • GCC has a special definition for NULL that allows the implicit conversion but gives a warning whenever one tries to convert it to a non-pointer type.

A conforming standard library implementation will define NULL when the header <cstddef> is included.  It's required by ISO C++.  Every sane implementation has it.

In the second example, you check for and define intptr_t and uintptr_t.  This is flawed because those two names are typedefs according to the language.  You can't check for them using the preprocessor.  Also, consider what could happen when GCC does define the names.  Your header would change the meaning of intptr_t and uintptr_t.  If you're lucky, the two definitions will agree; <cstdint> would say typedef ptrdiff_t intptr_t; typedef size_t uintptr_t;.  However, if the library uses some other types, bad things could happen.  Code that uses intptr_t and uintptr_t would then change simply because they included your header.

Macros are generally avoided in modern C++.   They replace all occurrences of the name regardless of where they are.  For example, the following code...

namespace boots
{
	typedef std::intptr_t intptr_t;
}

...will be preprocessed into...

namespace boots
{
	typedef std::ptrdiff_t ptrdiff_t;
}

It's like doing a find-and-replace-all in your code, the standard library's code, and everyone else's code.  Macros are evil.  If there's no way to do what you want without macros, then use them, but follow the naming convention so that users won't unknowingly have their code changed.  Likewise, don't declare identifiers that are in all capital letters that aren't prefixed by your library name, so that the user's macros won't change your code.

In the last piece of code, you check for and define nullptrnullptr is not a macro.  It's a keyword, and, like typedefs, you can't check for keywords during preprocessing.

Generally, if a feature isn't available in a particular C++ implementation that you need to support, don't use it.  In place of nullptr, you can use NULL.  If there's no replacement, like with intptr_t, then use your own definitions, but put them in your own namespace (or prefix their names if they're macros).  Namespaces prevent conflicts with user code and other libraries.  You shouldn't define anything outside of your namespace unless you're absolutely certain that the definition is, and will be, unique.  Remember: In any translation unit, a template, type, function, or object can have no more than one definition.  Imagine every library defining the std::vector!

To reiterate,

  • Standard libraries use names like __foo, _Foo, foo__, and _FOO.
  • Non-standard libraries use names like boots::foo, boots::Foo, boots::foo_, and BOOTS_FOO.
  • Users use names like foo, Foo, foo_, and FOO.

Thanks for taking the time to read this.  I'm a passionate C++ programmer who loves using libraries to ease my work.  But it's frustrating to see libraries trample over each other.  And it's impossible for me to review all the library code by myself.  You might know Stephan T. Lavavej--he's a developer of the Visual C++ standard library implementation.  He's really good with words.  Maybe he could give you some tips.

Coordinator
Aug 9, 2011 at 3:55 AM

OK, well, first of all, you can't do:

char* c = malloc(1);

to start with - it emits this:

zz.cpp(5) : error C2440: 'initializing' : cannot convert from 'void *' to 'char*'
        Conversion from 'void*' to pointer to non-'void' requires an explicit cast

That's completely independent of whether someone has defined NULL.

Next, gcc is doing something a little non-standard, by treating NULL the same as nullptr, which is why nullptr exists. That's fine, and the expectation is that the user has almost certainly already defined NULL, so anything I do in this class won't matter. You see the same definition all over many other headers. Note that I don't bother to make it conditional on C++, since this class obviously isn't compiling on a C file.

Next, as it turns out, I don't need intptr_t, and the instances of uintptr_t can be replaced with size_t. Regardless, uintptr_t and size_t, and the corresponding signed values are guaranteed to have the same size and behavior. Replacing one for the other won't cause any harm, even if it is regrettable. I am well aware of the evils of macros, and have largely tried to avoid them any time possible. I'm also well aware of how to use namespaces, and you may note that these are used internally to the class. At the moment, this version of the class isn't part of an external set of libraries, and thus it would be inappropriate (and clumsy) to add namespaces. There are namespaces defined in the Visual Studio version, which I also maintain.

I also don't think I've named anything with __, except to follow what Visual Studio has done with __int8, __int16, etc. While that may not be something you're happy with, I can count on these being present in my primary target compiler. That one is a "won't fix".

Yes, I do know Stephan. He's certainly a very smart guy, though I did report some integer-related bugs in his STL allocator code.

While we're giving tips, you're more than welcome to report bugs. I'll certainly fix the uintptr_t and intptr_t issues - it was a surprise that my Linux system wouldn't compile when these types were used, and you're right that the defines aren't working as expected. However, please do not use words like 'evil', and it is especially silly to lecture me on modern C++ techniques when you're looking at a class that uses partial template specialization extensively. I'm clearly acquainted with modern programming techniques.

Here's what I would have preferred to see:

Issue - uintptr_t and intptr_t shouldn't be defined. intptr_t isn't used, and size_t would work just as well where uintptr_t is used.

You're also looking at nearly 7000 LOC that have evolved over the last 8 years. It would be a miracle if everything were perfect.

 

Aug 9, 2011 at 4:31 AM

I'm sorry, I made a mistake in that example.  I meant char *p = NULL;.  Can't believe I missed that after proofreading it dozens of times.  :-(

I'm sorry if I came off harsh.  That was certainly not my intent.  I used the word 'evil' because Stephan uses it to describe such things, and I put a lot of effort into the examples because I saw some questionable code and wanted to show you exactly my sentiments.  I respect you and your work.  Please don't take it personally.  I'll shorten my explanations from now on.

Congratulations on keeping this project going for so long!  I saw your demo of SafeInt on Channel 9 and came here to check out the library.  I'm very grateful that there's such a mature checked-integer library available for free.

Coordinator
Aug 9, 2011 at 6:37 PM

That's an interesting issue with the NULL declaration - oops - my bad. I was surprised that my gcc environment didn't have it defined, and then I got the two forms of the definition flipped.

Stephan and I have gotten into some debates, largely because he works only on a library, where I write features, and work on a wide array of code - some is high level C++, some is C written like you might have seen 20 years ago. He tends more towards what's theoretically correct, and I tend more toward the practical. For example, it is his position that we should never allocate an array of bytes, but always use  a std::vector. He's right that the vector is safer, but go try to take apart a TCP/IP header using a vector - wouldn't be fun. Might be possible, but I'm not sure whether anyone coming after could figure out what it was doing. That said, I do use std::vector for many cases where I need an allocated array - an iterator (at least a current implementation) and the [] operator will crash rather than create an exploit. It's also a matter of absolutes - it isn't that we should _never_ call new[], it is that we should prefer std::vector.

Same sort of thing with macros - there are things you can only do with macros. I don't like them, try not to use them, and often rip them out and turn them into inline functions, but they're not evil - just a dangerous tool (just like the rest of the language).

The issues you pointed out were good ones - changes made as a result:

NULL (if needed) is defined as 0
nullptr is only defined if you define NEEDS_NULLPTR_DEFINED 1
intptr_t define is removed
uintptr_t instances changed to size_t, define is removed

An intentional design decision was to reduce the number of headers required by this class to the least possible. A side-effect is that depending on the environment, you don't always see everything defined that you expect.

Issues I'm not going to address at the moment -

The __int8, __int16, etc, are artifacts because the library was originally written for Visual Studio, and I can use them there without including any headers. They're nicely descriptive, which is handy in a world where 'long' has different meanings on different compilers and architectures. So that might be a technical violation of the language, but you can blame the Visual Studio team - I'm just following them. I could go rename them all, but it would make getting a diff between versions a major pain. So that's a won't fix.

Greater use of namespaces - there's different valid arguments here. Some of the guts of the class work nicely as standalone utilities, especially IntTraits (yes, I know Boost has an analogue). Wrapping these in a namespace would be inconvenient. However, we made the opposite decision with the Visual Studio version. Using a namespace for the overall class seems inappropriate unless it is part of a larger library - again, the Visual Studio version does get wrapped in a namespace, along with a number of other utilities. I'd consider putting more of the implementation into a namespace, but it seems less important than say getting the throw decorations correct.

Thanks for the input, and there's no offense taken.

Aug 9, 2011 at 9:51 PM

I believe GCC adheres to the standard with NULL, so it's not exactly treated the same as nullptr.  It just generates warnings where nullptr would generate errors.

I often favor the theoretically correct myself.  I'm not sure why using a std::vector for safety checks would be a theoretical argument, though; the standard doesn't mandate the safety checks.  It seems more of a practical one, since theoretically our code is error-free while practically our code has buffer overflows and the std::vector catches them.  Can I have an example where a std::vector can't easily replace operator new[]?

I agree that macros are a necessary part of the language.  It's just that many programmers still use macros inappropriately.  I couldn't assume you knew about them.  Programmers have surprisingly fragmented knowledge of C++, so someone who knows how to write template specializations but doesn't know how the preprocessor works doesn't sound far-fetched to me.  Just look at the code of various open-source libraries other than Boost.

I don't understand why you need to define nullptr.  Can't you use NULL in your code instead?  Remember, nullptr isn't the same as NULL.  And you'll be making the user deal with errors when their compiler doesn't support it.

I know the __int8 and __int16 stuff would be difficult to remove.  I guess you weren't planning on this library outgrowing MSVC when you started it.  If you could do it again today, you'd use typedefs, right?

I'm not as worried about the namespaces as I am with the macros.  But, of course, other issues take priority over both of those.  Identifier conflicts would probably cause compile errors anyway.  If nobody's complained yet, then your macros have good chances of uniqueness.

I'd submit issue reports like you prefer, but unfortunately my time is extremely limited.  Having your code reviewed by experts is mainly what I'm going for.  Boost developers can find and discuss bugs with you just like I have.  Even if your library doesn't make it into Boost, the review process would be a nice way to improve it.

Coordinator
Aug 10, 2011 at 6:35 PM

I don't have time to pull up the headers to show a concrete example, but an IP header is a variable-length structure that has a minimum of 20 bytes. So you have to calculate the offset to the TCP header, and you end up using pointer math to do so. I could allocate the bytes using a vector, but the only gain I'd get out of it would be to have it clean itself up, and I'd take a hit of some overhead in what's typically highly perf-sensitive code. If I can't realistically use iterators, then converting the vector to a raw pointer really doesn't gain me much in terms of safety.

Another example that I'd really rather not attempt to use a vector with would be a security descriptor on Windows - variable-length structs followed by more variable length structs. You will run into the same issues with most images, especially WMF or EMF, and to some extent bitmaps. So Stephan's point was that one should _never_ call new[] - mine is that it should be called as infrequently as possible.

I know a fair bit about macros, though what did escape me and led to this mistake was that you can't check for a keyword with a #if defined. I should have tested exactly that piece of code. I've stepped through literally every single line and return case in the rest of the class, but defines and such are somewhat more annoying to debug.

You're right that many programmers have a very fragmented knowlege - ran into a guy once that didn't know how to use sprintf, and went though hideously complex C++ hoops to do the same thing. You're also right that there's an amazing variance of errors and brilliance out there - I work on Office, which is over 100 million LOC, and I think every programming approach can be found in there somewhere - partially due to the fact that some of it is 20-25 years old.

I'm sorely tempted to go back to NULL. That's what I get for attempting to use new features. I only actually need it in one function. Other instances of NULL occur in some of the pointer overloads.

No, I wasn't planning on it being used on other compilers at the time I started. If I were to do it over, I'd probably use the types set up in stdint.h.

To be honest, this code has been reviewed by a large number of experts. Some find interesting issues, others don't. The most effective thing that we've done is to create a rigorous test harness. It's one thing to look at code and think it looks good, quite another to _prove_ it works - we caught errors created by the Intel compiler this way, and then proved that we'd coded around them. Most of what's left are things like you caught, where there's no runtime effect, just playing nice with other libraries and code. The remainder of what's left is debatable issues, like how much of the guts to enclose in a namespace, one file or two, whether to put the whole class's external interfaces in a namespace, etc.

I should still look into getting it into Boost, though I think the legal stuff is the bigger hurdle. We have a perfectly fine license, they do too, not sure why everyone can't just get along... It would be one thing is one license were restrictive, but neither of these are.

Aug 10, 2011 at 11:27 PM

This should work:

void handle_packet(vector<char> packet)
{
    // determine length of IP header
    uint32_t word_1;
    packet.at(4 - 1);
    memcpy(&word_1, &packet[0], 4);
    word_1 = ntohl(word_1);
    auto length = 4 * (SafeInt<uint32_t>(word_1) << 24 >> 24 >> 4);
    if (length < 20)
        throw some_error();
    packet.at(length - 1);
    // get iterator to data section
    auto data = packet.begin() + length;
    if (data != packet.end())
    {
        // do stuff with data
    }
}

The only reason to use operator new[] that I can think of is optimization, though vector is so fast in VC10 that I can't see it happening much.  I would store variable-length structs using malloc.  operator new[] seems inappropriate if I'm circumventing the type system anyway.

I know it sucks giving up nullptr.  I love it myself.  NULL was a mistake to me.  But I've tried shoehorning nullptr into my program when GCC didn't have it, and it wasn't easy.  std::function needed an overload for nullptr_t, for example.

stdint.h isn't available in VC9.  I wanted to see if you would add typedefs for an extra layer of portability, like:

// assume we have <stdint.h> by default
#define SAFEINT_HAS_STDINT_H
// MSVC versions earlier than 10.0
#if defined _MSC_VER && _MSC_VER < 1600
#undef SAFEINT_HAS_STDINT_H
#endif  // defined _MSC_VER && _MSC_VER < 1600

#ifdef SAFEINT_HAS_STDINT_H
#include <stdint.h>
#endif  // SAFEINT_HAS_STDINT_H

namespace safeint
{
#ifdef SAFEINT_HAS_STDINT_H
    using ::int8_t;
    using ::uint8_t;
#elif defined _MSC_VER
    typedef __int8 int8_t;
    typedef unsigned __int8 uint8_t;
#else   // defined _MSC_VER
#error need fixed-width integer types
#endif  // defined _MSC_VER

    template<class Ty>
    class safeint
    {
    public:
        operator int8_t() const;

        operator uint8_t() const;
    }
}

That's how I would do it.

By the way, the line endings are inconsistent in 3.0.16.

Coordinator
Aug 11, 2011 at 2:09 AM

Heh - here's how you really do it:

unsigned long GetHeaderLen(const char* pPacket, unsigned long cbPacket)
{
   ip* pIp = reinterpret_cast<ip*>(pPacket);
   if( pIp == NULL || pIp->ip_v != 4) return 0;
   unsigned long len = pIp->ip_hl * 8; // int overflow not possible, ip_hl can be at most 15 - only 4 bits
   return len > cbPacket || len < 20 ? 0 : len;
}

Lots easier and no overhead. You _can_ do nearly anything using STL, the question is whether maybe plain old C might be a more efficient choice (and sometimes more readable)

Usually Visual Studio fixes up line endings - have to look into that.

The point about the header is why I try as much as possible to avoid depending on them (for this sort of project). Given some number of compilers that are supported on some number of platforms, the more headers, the more chance for headaches.

Aug 11, 2011 at 3:45 AM

It really depends on the requirements of the application.  In most cases, the overhead will be negligible.  Without solid profiling, I can't justify the security tradeoff.  But if you have proof that the vector has too much overhead, then of course you shouldn't use it.  Otherwise, it would be premature optimization.

There's a way to get the fixed-width types without the header by using limits.h instead, but practically only Boost needs that kind of portability.

Sep 24, 2011 at 9:28 PM
Timothy003 wrote:

.... In most cases, the overhead will be negligible.  Without solid profiling, I can't justify the security tradeoff.  But if you have proof that the vector has too much overhead, then of course you shouldn't use it.  Otherwise, it would be premature optimization.

Seriously? A program has to be correct before anything else.

Out of curiosity, how can you be certain the attacker does not control the input?

Sep 24, 2011 at 9:44 PM
noloader wrote:
Timothy003 wrote:

.... In most cases, the overhead will be negligible.  Without solid profiling, I can't justify the security tradeoff.  But if you have proof that the vector has too much overhead, then of course you shouldn't use it.  Otherwise, it would be premature optimization.

Seriously? A program has to be correct before anything else.

Out of curiosity, how can you be certain the attacker does not control the input?


Are you implying that dcleblanc's code is safer and more correct?  Other than relying on undefined behavior with the type punning, there's already a bug in it; it doesn't validate the packet's size.  With vector, iterator debugging will give you a better chance of catching incorrect code.

The two examples don't do the same things, anyway.

Sep 25, 2011 at 9:33 PM
Edited Sep 26, 2011 at 3:26 AM
Timothy003 wrote:
noloader wrote:
Timothy003 wrote:

.... In most cases, the overhead will be negligible.  Without solid profiling, I can't justify the security tradeoff.  But if you have proof that the vector has too much overhead, then of course you shouldn't use it.  Otherwise, it would be premature optimization.

Seriously? A program has to be correct before anything else.

Out of curiosity, how can you be certain the attacker does not control the input?


Are you implying that dcleblanc's code is safer and more correct?  Other than relying on undefined behavior with the type punning, there's already a bug in it; it doesn't validate the packet's size.  With vector, iterator debugging will give you a better chance of catching incorrect code.

The two examples don't do the same things, anyway.

# Are you implying that dcleblanc's code is safer and more correct?
Yes, I believe LeBlanc's SafeInt code is safer in general.

If the time ever comes, I believe ignoring wrap and overflow will be grossly negligent. The industry enjoys software patents, it should also enjoy liability.

# Other than relying on undefined behavior with the type punning...
My bad, I was commenting in general, and not on the particular examples.

For what its worth, the Visual Studio compiler is more tolerant of punning. I think VS just does the right thing and gets the alignments right (with no warning). There is some hand waiving on the way the code is fixed up at runtime, though.

# With vector, iterator debugging will give you a better chance of catching incorrect code.
I love debug builds and asserts (I'm a student of John Robbins), but this is a problem in practice. To convince yourself, install libboost via apt-get or yum and define _GLIBCXX_DEBUG (http://gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode.html). I recently [incorreclty] filed a report against Boost::Regex (it was an ABI problem).

I know I could have built the source from scratch. But Boost has broken the tried and true configure/make/install in favor of something else buried in a spaghetti of documentation. I could be wrong, but I had no joy in building on Solaris (a Boost bug report was filed, no acknowledgement to date).

Jeff

Sep 25, 2011 at 10:57 PM
Edited Sep 26, 2011 at 3:18 AM
Timothy003 wrote:

Boost has really smart people in its community, and sets ultra-high standards.  Libraries are peer-reviewed too, so you'd get a lot of good feedback on your code.  For example, I grabbed the latest SafeInt3.hpp, opened it up, and immediately saw this:

#if !defined NULL
#define NULL ((void*)0)
#endif
// These two may not be defined, either
#if !defined uintptr_t
#define uintptr_t size_t
#endif

#if !defined intptr_t
#define intptr_t ptrdiff_t
#endif
// Might need this for gcc and older Microsoft compilers
#if !defined nullptr
#define nullptr NULL
#endif

Code like that is evil.  The folks at Boost would surely let you know about it, and you'll learn modern C++ programming paradigms, improving the quality of your code.

# Boost has really smart people in its community, and sets ultra-high standards.
I'm not sure about the ultra high standards.

# Libraries are peer-reviewed too, so you'd get a lot of good feedback on your code
That  Boost will accept anything and hope that someone will [eventually] spot an error is a detriment, not a feature. For example, it appears Boost accepted threads and rolled them out without rigorous review or acceptance testing. Distributions generally don't backport bug fixes, so downlevel users are stuck with defective code.

On my cursory review, Boost::Threads did not meet production quality (I filed about 15 bugs against it). For example, ignoring return values from WaitForSingleObject and pthread_mutex_lock was egregious. In addition, the subproject suffers resource leaks on negative code paths and race conditions.

Ignoring return values is something I would expect from a self-taught, l33t, K&R coder. I don't think a CompSci freshman would make the same mistakes. And I don't want code like that in my code base.

It also appears Boost does not audit the bug database. To date, none of the bugs have even been acknowledged.

# Code like that is evil. The folks at Boost would surely let you know about it...
Have a look at Boost::Test and its use of macros sometime. If you are offended by LeBlanc's three uses, you will be horrified with Boost::Test.