Is Facebook Code Open Source

Elgin Carmona

Aug 5, 2024, 12:45:35 AM
to lesclowpzesi
We launched Rebound, a physics and animation library for Android, at our Mobile @ Scale event in October. Will Bailey wrote up the project in this recent blog post, and we believe that modeling real-world physics is a powerful way to create natural and tactile animations and interactions within apps.

Web technologies remain relevant for Facebook too, on both mobile and desktop clients. On the front end, much of our open source focus has been on supporting our fast and flexible JavaScript library React, which we launched at JSConf in May.


HHVM, the HipHop Virtual Machine, is by far the most significant and followed project in our portfolio, and has enjoyed huge support from the PHP ecosystem. 2013 has seen almost 4,000 commits to the project, and great strides made in terms of performance and compatibility with third-party PHP frameworks, important for broader community adoption.


The Python Static Analyzer (Pysa) is a static code analysis tool that ships as part of Pyre. Pysa performs Python source code analysis and uses taint analysis to identify potentially exploitable vulnerabilities within Python applications.


Taint analysis is designed to trace untrusted data through an application. It looks for cases where this untrusted data is used in potentially exploitable functionality within an application, such as SQL queries.


Taint analysis detects flows of data from user-controlled inputs (sources) to potentially exploitable functionality (sinks). Pysa is designed to use a collection of default and user-defined sources and sinks for Python taint analysis.
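The source-to-sink idea can be illustrated with a small runtime sketch. Note the assumptions: Pysa itself works statically on source code and never runs the program, so this is only an analogy, and the names Tainted, source_user_input, build_query and sink_run_sql are hypothetical and not part of Pysa.

```python
class Tainted(str):
    """A string subclass marking data that came from an untrusted source."""


def source_user_input(raw: str) -> str:
    # Source: anything the user controls is marked as tainted on entry.
    return Tainted(raw)


def build_query(name: str) -> str:
    # String formatting returns a plain str, so taint is propagated by hand.
    query = f"SELECT * FROM users WHERE name = '{name}'"
    return Tainted(query) if isinstance(name, Tainted) else query


def sink_run_sql(query: str) -> str:
    # Sink: refuse tainted data instead of executing it.
    if isinstance(query, Tainted):
        raise ValueError("tainted data reached the SQL sink")
    return "executed"
```

A query built from a constant passes through the sink, while a query built from source_user_input(...) raises ValueError, which is the kind of flow a taint analyzer would report.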


The taint.config file is a JSON file that stores the primary definitions for Pysa taint analysis. This includes definitions of the taint sources, sinks and rules. Sources and sinks are defined using the same syntax as Python 3 type annotations.


In Pysa, a rule defines a flow from one or more sources to one or more sinks that is of interest. For example, a Pysa taint.config file may contain a rule for SQL injection that specifies sources of untrusted input and an SQL query as a sink.
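As a rough sketch, a taint.config with one SQL injection rule could be generated as shown here. The key names (sources, sinks, features, rules, code, message_format) follow the layout described in the Pysa documentation as best I recall it, so check them against the current docs. The rule code 9001, the comments and the helper undefined_names are illustrative assumptions.

```python
import json

# Assumed layout of a taint.config file; names and the code number are illustrative.
taint_config = {
    "sources": [{"name": "UserControlled", "comment": "data supplied by the user"}],
    "sinks": [{"name": "SQL", "comment": "argument used to build a SQL query"}],
    "features": [],
    "rules": [
        {
            "name": "SQL injection",
            "code": 9001,  # hypothetical rule code
            "sources": ["UserControlled"],
            "sinks": ["SQL"],
            "message_format": "User-controlled data may reach a SQL query",
        }
    ],
}


def undefined_names(config: dict) -> set:
    """Return source or sink names that rules reference but the config never declares."""
    declared_sources = {entry["name"] for entry in config["sources"]}
    declared_sinks = {entry["name"] for entry in config["sinks"]}
    missing = set()
    for rule in config["rules"]:
        missing |= set(rule["sources"]) - declared_sources
        missing |= set(rule["sinks"]) - declared_sinks
    return missing


# Serialize to the JSON text that would be saved as taint.config.
config_text = json.dumps(taint_config, indent=2)
```

The helper is only a sanity check for this sketch: a rule that names a source or sink missing from the declarations is almost certainly a typo.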


The other important type of Pysa configuration file is the model file. A Pysa model file (.pysa file extension) is used to annotate Python code with sources, sinks, sanitizers (functions that remove taint from data, such as hash algorithms) and features (additional metadata assigned to taint flows). Pysa has a number of built-in model files, and users can define additional custom models. When annotating a Python source code file, Pysa will use the union of all applicable model files.


One of the major limitations of source code analysis is that it can only see a subset of the code within a particular application. In the case of Pysa, this analysis is limited to the code in the code repository where Pysa is run and the directories explicitly specified within a .pyre_configuration file.


This means that Pysa is largely blind to the dependencies within a Python application. This is problematic because a dependency may contain unknown sinks or other functionality that impacts taint analysis (such as sanitizers).


Like most taint analysis tools, Pysa assumes that any function that it lacks visibility into and that has a tainted input produces a tainted output as well. While it is possible to explicitly label some functions as not transmitting taint, this is unscalable. As a result, Pysa can generate a number of false positive detections.
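The conservative default can be pictured as a lookup with a pessimistic fallback. This is a toy model, not Pysa's implementation; the table KNOWN_PROPAGATES and the function output_is_tainted are hypothetical names.

```python
# Hypothetical summaries for functions the analyzer can see or has been told about.
# True means tainted input yields tainted output; False marks a sanitizer.
KNOWN_PROPAGATES = {
    "str.upper": True,
    "hashlib.sha256": False,
}


def output_is_tainted(function_name: str, input_is_tainted: bool) -> bool:
    # Unknown functions fall back to True: assume they pass taint through.
    propagates = KNOWN_PROPAGATES.get(function_name, True)
    return input_is_tainted and propagates
```

Every function missing from the table keeps taint alive, which is why unmodeled dependencies tend to inflate the number of reported flows.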


Additionally, all attributes of a tainted object are also considered tainted in Pysa. While this is good in some cases, it can also generate false positives. For example, if you have a tainted object tainted, then a reference to tainted.__class__ will also be labeled as tainted, generating a false positive detection.


This potential for false positives means that Pysa results should not be trusted absolutely and require further analysis. However, the bias toward false positives (by over-labeling taint) is preferable to false negatives, where potentially exploitable vulnerabilities may be overlooked.


Additionally, Pysa cannot detect all potential vulnerabilities within an application, making it necessary to use other analysis techniques, such as dynamic code analysis tools, as well. Combining multiple static code analysis tools (like Pysa and pylint), dynamic code analysis and penetration testing helps to dramatically reduce the cybersecurity risk and exploitability of Python applications.


Howard Poston is a copywriter, author, and course developer with experience in cybersecurity and blockchain security, cryptography, and malware analysis. He has an MS in Cyber Operations, a decade of experience in cybersecurity, and over five years of experience as a freelance consultant providing training and content creation for cyber and blockchain security. He is also the creator of over a dozen cybersecurity courses, has authored two books, and has spoken at numerous cybersecurity conferences. He can be reached by email at how...@howardposton.com or via his website at


In some ways Facebook is still a LAMP site, that is, a service built on Linux, Apache, MySQL, and PHP. However, it has had to change and extend its operation to incorporate many other elements and services, and to modify its approach to the existing ones.


PHP, being a scripting language, is relatively slow when compared to code that runs natively on a server. HipHop converts PHP into C++ code which can then be compiled for better performance. This has allowed Facebook to get much more out of its web servers since Facebook relies heavily on PHP to serve content.


A small team of engineers (initially just three of them) at Facebook spent 18 months developing HipHop, and it was used for a few years. The project was discontinued back in 2013 and then replaced by HHVM (HipHop Virtual Machine).


It has a ton of work to do; there are more than 260 billion images on Facebook, and each one is saved in four different resolutions, resulting in more than 20 petabytes of data. And the scale is constantly increasing, with users uploading one billion new photos each week, or 60 terabytes of data.
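A quick back-of-the-envelope check of these figures, assuming decimal units (1 PB = 10^15 bytes) and assuming the weekly 60 terabytes covers all four resolutions of each new photo. Both assumptions are mine, and the source numbers are lower bounds, so the results are rough.

```python
images = 260 * 10**9          # more than 260 billion images
resolutions = 4               # each image is stored in four sizes
total_bytes = 20 * 10**15     # more than 20 petabytes

stored_files = images * resolutions            # about 1.04 trillion files
avg_kb = total_bytes / stored_files / 1000     # average size per stored file in KB

weekly_photos = 10**9         # one billion new photos per week
weekly_bytes = 60 * 10**12    # 60 terabytes per week
weekly_avg_kb = weekly_bytes / (weekly_photos * resolutions) / 1000

print(f"{stored_files:,} files, about {avg_kb:.1f} KB each")
print(f"new uploads: about {weekly_avg_kb:.1f} KB per stored file")
```

Under these assumptions the totals imply roughly 19 KB per stored file overall and 15 KB per stored file for new uploads, so the two sets of figures are broadly consistent with each other.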


For example, the chat window is retrieved separately, the news feed is retrieved separately, and so on. These pagelets can be retrieved in parallel, which is where the performance gain comes in, and it also gives users a site that works even if some part of it is deactivated or broken.
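The parallel retrieval and the tolerance for a broken part can be sketched with asyncio. This is an illustration of the idea only, not Facebook's implementation; fetch_pagelet, render_page, the pagelet names and the delays are all hypothetical.

```python
import asyncio


async def fetch_pagelet(name: str, delay: float, fail: bool = False) -> str:
    # Stand-in for a backend call that produces one independent part of the page.
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return f"<div id='{name}'>...</div>"


async def render_page() -> dict:
    pagelets = {
        "chat": fetch_pagelet("chat", 0.02, fail=True),
        "news_feed": fetch_pagelet("news_feed", 0.01),
        "sidebar": fetch_pagelet("sidebar", 0.01),
    }
    # Run all pagelets concurrently; a failure is returned as a value, not raised.
    results = await asyncio.gather(*pagelets.values(), return_exceptions=True)
    return {
        name: (None if isinstance(result, Exception) else result)
        for name, result in zip(pagelets, results)
    }


page = asyncio.run(render_page())
```

Here the chat pagelet fails and comes back as None, while the news feed and sidebar still render, and the total wait is close to the slowest pagelet rather than the sum of all of them.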


Hadoop is an open source map-reduce implementation that makes it possible to perform calculations on massive amounts of data. Facebook uses this for data analysis (and as we all know, Facebook has massive amounts of data). Hive originated from within Facebook, and makes it possible to use SQL queries against Hadoop, making it easier for non-programmers to use.
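For readers unfamiliar with map-reduce, here is a minimal single-process sketch of the pattern using a word count. It shows the map, shuffle and reduce steps only; it does not use Hadoop, and the function names are my own.

```python
from collections import defaultdict


def map_phase(document: str):
    # Map: emit a (key, value) pair for every word.
    for word in document.split():
        yield word.lower(), 1


def shuffle(pairs) -> dict:
    # Shuffle: group all values that share a key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list):
    # Reduce: collapse the grouped values into one result per key.
    return key, sum(values)


def word_count(documents: list) -> dict:
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster the map and reduce calls run on many machines and the shuffle moves data between them; the structure of the computation stays the same.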


Facebook uses several different languages for its different services. PHP is used for the front end, Erlang is used for Chat, and Java and C++ are used in several places (and perhaps other languages as well). Apache Thrift is a cross-language framework, developed internally at Facebook for scalable cross-language services development, that ties all of these languages together and lets them talk to each other efficiently at scale. This has made it much easier for Facebook to keep up its cross-language development.


Facebook has a system they called Gatekeeper that lets them run different code for different sets of users (it basically introduces different conditions in the code base). This lets Facebook do gradual releases of new features, A/B testing, activate certain features only for Facebook employees, etc.
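A gate of this kind can be sketched as a deterministic check per user. This is not Gatekeeper's actual design, which the text does not describe; bucket, is_enabled and their parameters are hypothetical.

```python
import hashlib


def bucket(user_id: str, feature: str) -> int:
    # Hash the feature and user together so each feature gets its own stable split.
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(
    feature: str,
    user_id: str,
    is_employee: bool = False,
    rollout_percent: int = 0,
    employees_only: bool = False,
) -> bool:
    if employees_only:
        return is_employee
    # Gradual release: users whose bucket falls below the percentage see the feature.
    return bucket(user_id, feature) < rollout_percent
```

Because the bucket depends only on the feature name and the user ID, a given user gets the same answer on every request, and raising rollout_percent only ever adds users. The same split can serve as the assignment for an A/B test.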


Facebook carefully monitors its systems (something we here at Pingdom of course approve of), and interestingly enough it also monitors the performance of every single PHP function in the live production environment. This profiling of the live PHP environment is done using an open source tool called XHProf.
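XHProf is a PHP tool, so the following is only a Python analogy of per-function profiling: a decorator that records call counts and cumulative wall-clock time. The names call_stats, profiled and render_profile are hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

# Per-function totals: number of calls and cumulative wall-clock seconds.
call_stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def profiled(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the call even if the function raised.
            stats = call_stats[func.__qualname__]
            stats["calls"] += 1
            stats["seconds"] += time.perf_counter() - start
    return wrapper


@profiled
def render_profile(user_id: str) -> str:
    return f"profile:{user_id}"
```

After a few calls to render_profile, call_stats["render_profile"] holds the call count and total time, which is the kind of per-function data a production profiler aggregates.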


Not only is Facebook using (and contributing to) open source software such as Linux, Memcached, MySQL, Hadoop, and many others, it has also made much of its internally developed software available as open source.


Examples of open-source projects that originated inside Facebook include HipHop, Cassandra, Thrift, Scribe, React, GraphQL, PyTorch, Jest, and Docusaurus. Facebook has also open-sourced Flow, a static type checker for JavaScript that identifies issues as you code. If you are a JavaScript developer, definitely check it out. It can save you hours of debugging time.


In 2024, there are plenty of open source chatbot platforms to choose from. The best one for you will depend on your chatbot-building needs - your experience, coding language, desired capabilities, and specific use case.


Open-source chatbots are messaging applications that mimic human conversation. Open-source means the original code for the software is distributed freely and can easily be modified.




Open-source software leads to higher levels of transparency, efficiency, and control through shared contributions. This allows developers to create software of higher quality while increasing their knowledge of the software platforms themselves.


Botpress is designed to build chatbots using visual flows and small amounts of training data in the form of intents, entities, and slots. This vastly reduces the cost of developing chatbots and decreases the barrier to entry that can be created by data requirements.




Botpress has a visual conversation builder and an emulator to test your conversations. The built-in JavaScript code editor allows you to code actions that can be used to perform specific tasks. The NLU module lets you define intents, entities, and slots. This is how your conversational assistant can understand the input of the user.


